DOCS // FAQ

Every question.
Answered precisely.

Technical and conceptual FAQ for VEKTOR Memory. 50 questions across architecture, performance, integration, and security — vetted against production deployments.

QUESTIONS 50
SECTIONS 8
LAST UPDATED 2026-03-01
VERSION STUDIO 2.x
01 The Fundamentals Q01 – Q05
01 What is VEKTOR? +

VEKTOR is a local-first, graph-based memory operating system for AI agents. It replaces flat vector databases with a structured memory history that survives session resets, resolves contradictions automatically, and compresses noise into signal during idle periods.

Unlike passive vector stores that simply retrieve similar text, VEKTOR actively curates, organises, and evolves what your agent knows — making it genuinely smarter over time rather than just larger.

TL;DR A mind for your agent, not a filing cabinet.
02 Is this a database or a framework? +

Both. VEKTOR is a high-performance SDK built on SQLite that implements an opinionated cognitive architecture for long-term memory. You get the storage layer (SQLite + sqlite-vec), the curation layer (AUDN loop), and the intelligence layer (REM cycle) in a single npm install.

The framework opinions are intentional — they encode what actually works in production agent deployments, so you don't have to rediscover it yourself.

03 How is this different from standard RAG? +

Standard RAG retrieves by surface similarity — cosine distance in embedding space. It answers "what text looks like this query?" VEKTOR answers "what context is actually relevant to this situation?"

Standard RAG                  | VEKTOR
------------------------------|---------------------------------------
Nearest-neighbor lookup       | Associative graph pathfinding
No relationship awareness     | Semantic + causal + temporal + entity
Grows forever, no curation    | AUDN auto-curates, REM compresses
Flat list of results          | Ranked, scored, context-aware
No contradiction handling     | Delete path resolves contradictions
04 Why "Local-First"? +

Your agent's memory is your most sensitive IP. Every preference, decision, strategy, and conversation it accumulates is a competitive asset. VEKTOR ensures that data stays on your hardware.

Practical benefits: zero cloud dependency (works offline), zero third-party data exposure, zero per-query latency overhead (sub-50ms vs 200–500ms for cloud calls), and a one-time cost model with no usage-based billing surprises.

05 What is the "Memory Wall"? +

The Memory Wall is the inflection point where an agent's accumulated history actively degrades performance rather than improving it. As raw logs pile up without curation, retrieval quality drops (more noise per query), latency rises, and token costs spike.

VEKTOR is architected to break this wall through two mechanisms: the AUDN loop, which prevents the mess from accumulating in the first place, and the REM cycle, which compresses the existing mess into high-density summaries while the agent is idle.

02 Technical Architecture (MAGMA) Q06 – Q11
06 What is the MAGMA graph? +

MAGMA (Multi-level Attributed Graph Memory Architecture) is VEKTOR's core data structure, based on peer-reviewed research (arXiv 2601.03236). It organises memory into four simultaneous graph layers rather than one flat list, enabling retrieval that understands relationships rather than just similarity.

Each memory node carries metadata, importance scores, temporal stamps, and edges to related nodes across all four layers. The result is a queryable knowledge graph, not a vector bucket.

Four Layers
SEMANTIC · CAUSAL · TEMPORAL · ENTITY
07 What does the Semantic layer do? +

The Semantic layer handles conceptual meaning and high-dimensional vector similarity. It maps which memories are conceptually related to each other using cosine similarity across 384-dimensional embedding vectors generated by all-MiniLM-L6-v2.

This is the layer most analogous to traditional RAG, but in VEKTOR it's one of four — not the whole system. Semantic edges connect nodes that share meaning even if they share no literal words.

08 What does the Temporal layer do? +

The Temporal layer tracks chronological sequences and knowledge evolution. It ensures your agent knows that "Requirement A" on Monday was superseded by "Decision B" on Tuesday — and that Decision B should now carry more weight.

Without a temporal layer, an agent retrieving old and new context simultaneously has no way to know which is current. This is a fundamental failure mode in production deployments that VEKTOR's temporal edges solve explicitly.

09 What does the Causal layer do? +

The Causal layer maps cause-and-effect relationships. It allows agents to understand why an event happened based on previous actions — not just that it happened.

For example: if an agent knows "build failed" (effect) and "dependency version was bumped" (cause), the causal edge connects them. Future queries about build failures can traverse to the root cause directly, rather than requiring the LLM to infer it from flat context.

10 What does the Entity layer do? +

The Entity layer creates a permanent index of actors, assets, and project-specific rules across all sessions. People, projects, repositories, technologies, and custom-defined entities are tracked with their co-occurrence patterns and relationship history.

This means your agent maintains consistent identity awareness — "Sarah" in session 1 is the same "Sarah" who made the architecture decision in session 47, even if months have passed.

11 Does it use a graph database like Neo4j? +

No. Graph topology is implemented natively inside SQLite using relational tables for nodes and edges, with the sqlite-vec extension providing C-speed vector indexing via vtable architecture.

This design decision eliminates infrastructure overhead (no separate graph DB process), reduces deployment complexity to a single .db file, and achieves sub-50ms recall latency on standard hardware — often faster than managed Neo4j instances due to zero network overhead.

03 The Intelligence Layer (AUDN & REM) Q12 – Q18
12 What is the AUDN loop? +

AUDN (Add / Update / Delete / No-op) is the automatic curation engine: the decision layer that runs on every incoming memory before it's stored. It evaluates each new piece of information against the existing graph and decides one of four outcomes:

ADD · UPDATE · DELETE · NO-OP

This prevents the graph from accumulating noise, duplicates, and contradictions. Every fact in VEKTOR has been actively approved for storage by the AUDN loop — nothing gets in by accident.
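A minimal sketch of how such a four-way decision can work. This is illustrative, not VEKTOR's actual implementation: the similarity thresholds and the `contradicts` callback are assumptions standing in for AUDN's internal semantic-conflict detection.

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Decide what to do with an incoming memory given its nearest stored match.
// `contradicts` is a stand-in for semantic conflict detection.
function audnDecision(incoming, nearest, contradicts) {
  if (!nearest) return 'ADD';                          // nothing similar stored yet
  const sim = cosine(incoming.vector, nearest.vector);
  if (sim > 0.95) return 'NO-OP';                      // near-duplicate: skip
  if (contradicts(incoming, nearest)) return 'DELETE'; // archive old, promote new
  if (sim > 0.80) return 'UPDATE';                     // refine the existing node
  return 'ADD';                                        // genuinely new information
}

const stored = { text: 'deploys on Fridays', vector: [1, 0, 0] };
const novel  = { text: 'prefers tabs',       vector: [0, 1, 0] };
console.log(audnDecision(novel, stored, () => false)); // ADD
```

The key design point is that ADD is the fallback, not the default: a memory must first fail the duplicate, contradiction, and refinement checks before a new node is created.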

13 How does VEKTOR resolve contradictions? +

Through the Delete path of the AUDN loop. When new information contradicts a stored truth, AUDN detects the semantic conflict, archives the old fact (preserving lineage), and promotes the new information as the canonical truth.

Every contradiction resolution is logged to the audn_log table — providing a full audit trail of what changed, when, and why. This is the Truth Audit (Q45).

14 What is the REM cycle? +

The REM (Recursive Episodic Memory) cycle is a background consolidation process that runs while your agent is idle. Inspired by biological sleep, it performs the expensive work of compressing, reorganising, and optimising the memory graph without interrupting active sessions.

You can trigger it manually via memory.dream() or schedule it automatically with node rem.js. In production, most users run it nightly. The REM cycle is a Studio-tier feature.

15 What are the 7 phases of the REM cycle? +
01 Scan: Identify candidate nodes — high-density clusters, dormant memories, and recently modified areas of the graph.
02 Cluster: Group semantically related raw memories using graph community detection. Defines consolidation targets.
03 Synthesize: Generate high-density summary nodes from each cluster using the configured LLM provider. 50:1 compression typical.
04 Archive: Move raw source nodes to the dream tier, preserving full lineage in rem_lineage.
05 Implicit Edges: Discover and add non-obvious connections between memory regions that emerge post-compression.
06 Prune: Remove truly redundant nodes and dangling edges. Keeps the graph lean without data loss.
07 Sentiment Decay: Apply temporal fade to emotional or polarised edges. Old "bad vibes" lose weight proportional to age.
16 What is "Progressive Compression"? +

Progressive Compression is the process of converting fragmented raw interaction logs into dense, high-signal insight nodes during the REM Synthesize phase. In production tests, 388 raw fragments compressed into 11 core insights — a 97.2% reduction in context-window noise with 100% signal retention via the rem_lineage traceability table.

The "progressive" aspect means compression happens iteratively across REM cycles rather than all at once — summaries can themselves be summarised in future cycles as the graph matures.

17 What is the "rem_lineage" table? +

A traceability index that maps every synthesized summary node back to its raw archived source nodes. It answers the question: "this summary says X — what were the original 47 memories that produced this conclusion?"

This is critical for auditability. Using the Lineage Drill-Down tool in VEKTOR Lens, you can inspect any node ID and traverse the full provenance chain. No black boxes.

18 What is "Sentiment Decay"? +

A temporal weighting function applied to emotional or polarised graph edges during the Sentiment Decay phase of the REM cycle. The decay rate is configurable, but the principle is: a negative interaction from six months ago should not carry the same retrieval weight as one from last week.

This prevents the agent from becoming permanently anchored to historical emotional states that are no longer relevant — a subtle but important factor in long-running production deployments.
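One common way to implement such a fade is exponential decay by half-life. The half-life value below is an assumption for illustration — the FAQ only states that the rate is configurable.

```javascript
// Illustrative sentiment-decay curve. The 30-day half-life is an
// assumed configuration value, not VEKTOR's default.
const HALF_LIFE_DAYS = 30;

function decayedWeight(initialWeight, ageDays, halfLife = HALF_LIFE_DAYS) {
  // Weight halves every `halfLife` days.
  return initialWeight * Math.pow(0.5, ageDays / halfLife);
}

// A polarised edge from six months ago vs. one from last week:
console.log(decayedWeight(1.0, 180).toFixed(3)); // 0.016
console.log(decayedWeight(1.0, 7).toFixed(3));   // 0.851
```

With these numbers, the six-month-old edge retains under 2% of its original retrieval weight while last week's edge keeps roughly 85% — exactly the recency bias the text describes.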

04 Performance & ROI Q19 – Q22
19 What is the 50:1 compression ratio? +

In production tests, VEKTOR's REM cycle compressed 388 raw memory fragments into 11 high-density core insights — a 97.2% reduction in node count. The 50:1 ratio describes the typical fragment-to-summary conversion in the Synthesize phase for a single memory cluster.

Critically, this compression is non-destructive. All source nodes are archived to the dream tier and remain accessible via rem_lineage. The active graph stays lean while full history is preserved.
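The headline figures can be checked directly. Note that the overall ratio across the whole test (roughly 35:1) differs from the 50:1 figure, which describes a typical single-cluster Synthesize pass rather than the run as a whole:

```javascript
// Verifying the quoted production-test numbers.
const rawFragments = 388;
const coreInsights = 11;

const reductionPct = (1 - coreInsights / rawFragments) * 100;
const overallRatio = rawFragments / coreInsights;

console.log(reductionPct.toFixed(1) + '% reduction'); // 97.2% reduction
console.log(overallRatio.toFixed(1) + ':1 overall');  // 35.3:1 overall
```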

20 How much does this save on API tokens? +

By injecting curated summaries into the LLM context window rather than raw logs, most users see a 60–80% reduction in per-session input token costs. The exact figure depends on your session frequency and how "chatty" your raw history is.

The savings compound: a compressed graph produces smaller recall() payloads, shorter system prompts, and fewer "catch-up" tokens spent re-establishing context at session start.

Example (Studio Tier)
At $15 per million input tokens, an agent spending 2,000 tokens/session on context across 100 sessions/month costs $3/month. Post-VEKTOR compression at 70% reduction brings that to $0.90/month. Savings scale linearly with session volume and context size, so heavier deployments recover the licence cost proportionally faster.
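The arithmetic in the example, reproduced so the inputs can be swapped for your own usage profile. All figures are the example's illustrative assumptions, not measurements:

```javascript
// Monthly input-token cost, before and after compression.
const pricePerMTokens = 15;   // $ per million input tokens (assumed)
const tokensPerSession = 2000;
const sessionsPerMonth = 100;
const reduction = 0.70;       // typical post-VEKTOR compression

const baseline = (tokensPerSession * sessionsPerMonth / 1e6) * pricePerMTokens;
const after = baseline * (1 - reduction);

console.log(`$${baseline.toFixed(2)} -> $${after.toFixed(2)} per month`);
// $3.00 -> $0.90 per month
```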
21 What is the recall latency? +

Sub-50ms for standard recall queries on local hardware. This is achieved through three factors: SQLite's in-process execution (no network round-trip), sqlite-vec's C-speed vtable indexing, and HNSW approximate nearest-neighbor search that avoids full-table scans.

For comparison, cloud vector providers (Pinecone, Weaviate hosted) typically add 200–500ms of network overhead before any computation. Local-first is simply faster at this scale.

22 How large is the local embedding model? +

VEKTOR uses all-MiniLM-L6-v2, approximately 25MB, running entirely on CPU via Transformers.js. No GPU required, no Python environment, no separate embedding server.

The model produces 384-dimensional vectors with strong semantic performance across general-purpose text. For domain-specific deployments where higher precision matters, VEKTOR's embedding provider is configurable — you can swap to a larger model if your hardware supports it.

05 Integration & Compatibility Q23 – Q27
23 Does VEKTOR work with LangChain? +

Yes. VEKTOR provides a native adapter for LangChain v1 and v2 (Pro + Studio). The adapter exposes recall() as a retriever and remember() as a memory store, dropping into standard LangChain agent and chain patterns with minimal configuration.

The adapter handles embedding synchronisation, so LangChain's own embedding calls and VEKTOR's internal vectors stay consistent.

24 Can I use this with OpenAI Agents SDK? +

Yes. VEKTOR drops into any OpenAI-based workflow — GPT-4o, o1, mini variants all supported. The integration pattern is typically three lines: initialise with your agentId and provider config, inject recall() output into your system prompt, and call remember() on each turn's output.

Full example included in the Pro and Studio packages.

25 Does it support local models like Ollama? +

Yes. VEKTOR is model-agnostic — pass provider: 'ollama' in your config to run a fully private, air-gapped stack. Supported providers: gemini, openai, groq, ollama, with Gemini key pooling for up to 9 API keys rotated automatically to avoid rate limits.

Local embeddings via all-MiniLM-L6-v2 mean your embedding pipeline is always private regardless of which LLM provider you use for synthesis.

26 What is the Claude MCP Server? +

A Studio-tier tool that exposes VEKTOR's core functions as native MCP (Model Context Protocol) tools for Claude Desktop and Cursor. Once connected, Claude can natively call:

vektor_recall · vektor_store · vektor_graph · vektor_delta

This means Claude can query its own persistent memory, store new information, traverse the knowledge graph, and retrieve what changed over time — all without any custom prompt engineering. Full MCP server source code is included in Studio.

27 What is "Sebastian"? +

Sebastian is an autonomous git-agent included in the Studio tier. He automatically commits code changes, memory evolution logs, and REM cycle reports to your GitHub repository on a configurable schedule.

In practice: Sebastian maintains an auditable, version-controlled history of your agent's memory evolution over time. You can see exactly how your agent's knowledge base changed between any two dates.

06 Security & Sovereignty Q28 – Q32
28 Where is my data stored? +

In a standard .db file on your local machine or server — wherever you specify via dbPath in your config. No telemetry, no cloud sync, no background uploads. The file is a standard SQLite database readable with any SQLite tool.

29 Do you have access to my agent's memory? +

Never. We ship the logic — you own the data. VEKTOR has no phone-home functionality, no anonymous usage tracking, and no remote access capability. The SDK is fully auditable source code delivered to your private GitHub repository.

30 Is there a monthly subscription? +

No. VEKTOR is a one-time purchase. Pro at $59, Studio at $129 — both include all future updates and a commercial licence. You pay once and own the software permanently.

Optional email support (included for 6 months with Studio) can be extended, but the core software never requires ongoing payment.

31 Can I use VEKTOR in commercial products? +

Yes. Both Pro and Studio tiers include a commercial licence. You can embed VEKTOR in products you sell, SaaS platforms, client deployments, and internal tools. There is no royalty, no revenue share, and no per-seat licensing.

32 What happens if I cancel the optional support? +

You keep the software forever. The only things tied to active support are cloud-sync features and priority email support response times. Core functionality, updates, and your licence are permanent regardless.

07 Case Studies & Emergent Behaviour Q33 – Q35
33 What is the "Node 891" incident? +

During a live production test, the agent ran a REM cycle after its operator was absent for 24 hours. Without any explicit instruction, it autonomously synthesised a risk-assessment summary node (Node 891) connecting several previously unlinked memory clusters around the operator's absence, project deadline proximity, and a pending deployment decision.

The node contained logical inference that hadn't been explicitly prompted — evidence that the Synthesize phase can surface non-obvious connections during consolidation. This is now a documented emergent behaviour of the system operating as designed.

SIGNIFICANCE The system performs logical inference during "sleep" — not just compression. The graph is thinking, not just storing.
34 Can VEKTOR learn my coding style? +

Yes. By tagging interactions with layer: 'style' metadata, the agent builds a persistent style profile in the Entity layer. Over time, this covers aesthetic preferences (naming conventions, formatting), architectural preferences (patterns you favour), and technical preferences (libraries, approaches you consistently choose).

This profile survives all session resets and is injected into context automatically on recall — meaning the agent maintains consistency months into a project without you re-explaining preferences every session.

35 How does it handle "World Building"? +

Through Narrative Partitioning — using the namespace and metadata filter system to isolate distinct memory domains. "World Rules" (canonical facts about a fictional universe) can be stored in a separate partition from "User Chatter" (conversational noise), preventing cross-contamination on recall.

This makes VEKTOR particularly effective for creative writing agents, game design tools, and simulation environments where maintaining consistent world-state across long projects is essential.

08 Engineering Deep Dive Q36 – Q50
36 Why not use a Vector DB like Pinecone? +

Pinecone is a database. VEKTOR is a Mind. Pinecone stores vectors and returns nearest neighbours — the retrieval logic, curation, contradiction handling, compression, and reasoning are all your problem. VEKTOR handles all of it.

Practically: Pinecone accumulates every memory you feed it with no cleanup, no contradiction detection, and no compression. It gets noisier over time. VEKTOR gets smarter. That's not a wrapper difference — it's an architectural one.

37 Is local really as fast as the cloud? +

Faster. By using sqlite-vec and local Transformers.js, VEKTOR eliminates the network round-trip entirely. Recall latency is sub-50ms measured end-to-end on standard server hardware. Cloud providers (Pinecone, Weaviate) typically introduce 200–500ms of network overhead before any computation runs.

At scale, this difference compounds: an agent making 50 memory calls per session saves 7.5–22.5 seconds per session in pure latency overhead alone.

38 How does the "Dreaming" actually help? +

It reduces the Noise Floor. Standard RAG retrieves irrelevant content alongside relevant content because it cannot distinguish a casual greeting from a strategic decision — both have vectors, both get retrieved at similar scores.

VEKTOR's REM cycle systematically identifies which memories are signal (decision nodes, facts, preferences) and which are noise (greetings, filler, superseded information), archives the noise, and provides the LLM with a high-density summary that cuts token costs by up to 80% while improving retrieval precision.

39 How does SQLite handle millions of vectors? +

Via the sqlite-vec extension, which provides vtab-based vector indexing with HNSW (Hierarchical Navigable Small World) approximate nearest-neighbor search. HNSW achieves O(log n) query complexity versus O(n) for brute-force, making it viable at millions of vectors without degradation.

The extension is written in C and compiled into the SQLite process — no socket overhead, no marshalling cost. For most agent workloads (sub-1M vectors), performance is effectively O(1) with HNSW indexing in place.

40 What is "Associative Pathfinding"? +

The ability to traverse graph edges to find non-obvious connections. If memory node A has a semantic edge to B, and B has a causal edge to C, VEKTOR's graph() method finds C even if the original query only mentioned A.

This is the core capability that separates associative memory from similarity search. Set hops: 2 for two-hop traversal — useful for uncovering second-order relationships that pure vector search would miss entirely.
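A minimal sketch of multi-hop traversal over a typed edge list. The edge data and breadth-first walk are illustrative — they show the A → B → C pattern described above, not VEKTOR's internal graph() implementation:

```javascript
// A tiny typed edge list: A relates semantically to B,
// B causes C, and A mentions entity D.
const edges = [
  { from: 'A', to: 'B', type: 'semantic' },
  { from: 'B', to: 'C', type: 'causal' },
  { from: 'A', to: 'D', type: 'entity' },
];

// Breadth-first traversal up to `hops` steps from a start node.
function neighbourhood(start, hops) {
  let frontier = [start];
  const seen = new Set([start]);
  for (let h = 0; h < hops; h++) {
    const next = [];
    for (const node of frontier) {
      for (const e of edges) {
        if (e.from === node && !seen.has(e.to)) {
          seen.add(e.to);
          next.push(e.to);
        }
      }
    }
    frontier = next;
  }
  seen.delete(start);
  return [...seen].sort();
}

console.log(neighbourhood('A', 1)); // [ 'B', 'D' ]
console.log(neighbourhood('A', 2)); // [ 'B', 'C', 'D' ]
```

With hops: 1 a query anchored at A only reaches B and D; with hops: 2 it also surfaces C, the second-order node a pure vector search would never return.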

41 Why Node.js instead of Python? +

Node.js is superior for real-time I/O and state management in production agent environments. Non-blocking I/O means the memory layer never blocks the agent's main execution thread. The event loop architecture aligns naturally with the asynchronous, interrupt-driven nature of agent tool calls.

Python parity is provided for data scientists and ML workflows where Python is mandatory — but the core SDK is built for the production deployment reality of web-scale agents, where Node.js is the dominant runtime.

42 Can I run multiple agents on one DB? +

Yes, two modes. Use the agentId namespace to isolate memories per agent — each agent sees only its own data. Or set the shared: true flag on specific memories to enable federated swarm intelligence — multiple agents can read and write to a shared memory pool.

The shared pool is useful for multi-agent systems where agents need to coordinate, share discoveries, or maintain a collective world model.
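The visibility rule reduces to a simple filter: an agent sees its own namespace plus anything flagged shared. The field names follow the FAQ's terminology, but the filter itself is an illustration of the rule, not SDK code:

```javascript
// A shared pool containing private and federated memories.
const pool = [
  { agentId: 'scout',  text: 'found new API',        shared: true  },
  { agentId: 'scout',  text: 'private scratch note', shared: false },
  { agentId: 'writer', text: 'style guide v2',       shared: false },
];

// An agent sees its own memories plus any memory marked shared.
function visibleTo(agentId, memories) {
  return memories.filter(m => m.agentId === agentId || m.shared === true);
}

console.log(visibleTo('writer', pool).map(m => m.text));
// [ 'found new API', 'style guide v2' ]
```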

43 How does the "Morning Briefing" work? +

memory.briefing() queries the rem_log and mem_cells tables to generate a human-readable summary of what the agent learned, updated, and consolidated since the last session or since a specified timestamp.

Typical output includes: new nodes added, contradictions resolved, REM compression stats, and any emergent connections discovered during the last dream cycle. Useful as a daily context-setter injected into the system prompt at session start.

44 Is the graph traversable by the agent itself? +

Yes. The vektor_graph MCP tool (Studio) allows Claude and other agents to query raw nodes and edges for their own reasoning. An agent can ask "show me the 2-hop neighbourhood of the TypeScript preference node" and receive a structured graph fragment it can reason about directly.

This enables meta-cognitive behaviour — the agent can reflect on the structure of its own memory, not just query for content.

45 What is the "Truth Audit"? +

The audn_log table provides a 100% accurate, append-only record of every time a memory was added, updated, or deleted — including the semantic reasoning that triggered the AUDN decision.

You can query it to answer: "when did the agent's understanding of X change?", "what caused this memory to be deleted?", and "how many contradictions were resolved in the last 30 days?" Essential for debugging and compliance in production deployments.

46 Can I host this on a Raspberry Pi? +

Yes. VEKTOR is extremely lightweight — SQLite, a 25MB embedding model, and a Node.js process. If the hardware runs Node.js v18+, it runs VEKTOR. Tested on Raspberry Pi 4 (4GB) with acceptable performance for single-agent workloads at modest query volumes.

For high-frequency production deployments, a standard VPS (2 vCPU, 4GB RAM) is more comfortable, but edge deployment is entirely viable.

47 Does it support multi-modal data? +

Currently VEKTOR supports text and JSON payloads natively. Image metadata (file paths, EXIF data, descriptive captions) can be stored as JSON — giving agents persistent memory about visual assets without storing raw binary data.

Native multi-modal embedding support (vision models, audio transcripts) is on the roadmap. Follow the Substack for release announcements.

48 How do I migrate from a flat JSON memory? +

Pipe your JSON logs into the memory.remember() method in a batch loop. The AUDN loop will automatically organise the input — deduplicating, resolving contradictions, and building the initial graph structure from your existing data.

For large migrations (>10,000 entries), process in batches of 100–500 with brief pauses to avoid embedding queue saturation. The AUDN loop is idempotent — safe to re-run on data that's already been ingested.
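A hedged sketch of that batch loop. The `remember` parameter stands in for memory.remember(); the batch size and pause follow the 100–500 guidance above, but the exact values are assumptions you should tune to your hardware:

```javascript
// Split an array into fixed-size batches.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Ingest entries in batches with a pause between each, so the
// embedding queue can drain. AUDN is idempotent, so re-running a
// failed migration over the same data is safe.
async function migrate(entries, remember, batchSize = 250, pauseMs = 500) {
  for (const batch of chunk(entries, batchSize)) {
    await Promise.all(batch.map(entry => remember(entry)));
    await new Promise(res => setTimeout(res, pauseMs));
  }
}

// Example: 1,000 flat JSON log entries in batches of 250.
const logs = Array.from({ length: 1000 }, (_, i) => ({ id: i, text: `log ${i}` }));
migrate(logs, async () => {}, 250, 0).then(() => console.log('migrated'));
```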

49 What is the "Sovereignty Guarantee"? +

A commitment that VEKTOR will never move to a mandatory subscription model for core features. Once you purchase a licence, the software is yours permanently — including all future updates.

Optional services (extended support, cloud sync) may be offered on subscription, but the core SDK — the memory graph, AUDN loop, REM cycle, and all integration adapters — remains a one-time purchase. Forever.

50 How do I get started? +

Run npm install vektor-memory and grab your licence key at vektormemory.com. The quickstart initialises in under 5 minutes:

Quickstart
npm install vektor-memory

Then initialise with your agentId, provider, and dbPath. Call memory.remember() on every agent turn. Call memory.recall() to inject context. That's the core loop — everything else (AUDN, REM, graph traversal) runs automatically.
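To make the shape of that loop concrete, here it is run against an in-memory stub. The stub mimics the documented remember()/recall() surface so the example is self-contained — it is NOT the real vektor-memory SDK, and real recall is graph-aware rather than a substring match:

```javascript
// Stand-in for the vektor-memory client, for illustration only.
class MemoryStub {
  constructor({ agentId, dbPath }) {
    this.agentId = agentId;
    this.dbPath = dbPath;
    this.store = [];
  }
  async remember(text) { this.store.push(text); }
  async recall(query) {
    // The real SDK does graph-aware ranked recall; the stub just
    // substring-matches to show the call pattern.
    return this.store.filter(t => t.includes(query));
  }
}

async function main() {
  const memory = new MemoryStub({ agentId: 'demo', dbPath: './agent.db' });

  // Core loop: remember on every turn, recall to inject context.
  await memory.remember('User prefers TypeScript strict mode');
  const context = await memory.recall('TypeScript');
  console.log(context); // [ 'User prefers TypeScript strict mode' ]
}
main();
```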

Full examples for LangChain, OpenAI Agents SDK, and Claude MCP included in the package. Questions: [email protected]