Vex — A Migration Tool for Vector Memory

Vector databases don't talk to each other. You pick Pinecone for a project, your client runs Qdrant, a new team member swears by Weaviate, and suddenly migrating a few million embeddings becomes a bespoke engineering problem every single time.

Vex is our answer to that: a zero-dependency CLI that moves vectors between any supported store using a single open interchange format, .vmig.jsonl.

The Problem with Vector Portability

Every vector store has a different API shape, a different concept of namespaces, a different way of attaching metadata. Pinecone has namespaces and sparse/dense index types. Qdrant has collections and payload filters. Weaviate has classes and schema objects. Chroma has collections with embedding functions baked in. They all store roughly the same thing — a float array with some metadata — but getting that array from A to B means writing a custom exporter for A and a custom importer for B, every time.
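
To make the mismatch concrete, here is a minimal Python sketch that maps two store-specific shapes onto one common record. The Pinecone field names (id, values, metadata) and Qdrant field names (id, vector, payload) follow those stores' public APIs. The helper names, the trimmed-down output record, and the choice to map a Qdrant collection onto namespace are illustrative assumptions, not Vex's actual code.

```python
from typing import Any


def from_pinecone(vec: dict[str, Any], namespace: str = "") -> dict[str, Any]:
    """Map a Pinecone-style vector (id / values / metadata) to a common record."""
    values = list(vec["values"])
    return {
        "id": str(vec["id"]),
        "vector": values,
        "dims": len(values),
        "namespace": namespace,
        "metadata": dict(vec.get("metadata") or {}),
    }


def from_qdrant(point: dict[str, Any], collection: str = "") -> dict[str, Any]:
    """Map a Qdrant-style point (id / vector / payload) to the same record."""
    vector = list(point["vector"])
    return {
        "id": str(point["id"]),  # Qdrant ids may be integers; stringify them
        "vector": vector,
        "dims": len(vector),
        "namespace": collection,  # assumption: collection name stands in for namespace
        "metadata": dict(point.get("payload") or {}),
    }


a = from_pinecone(
    {"id": "mem_1", "values": [0.1, 0.2], "metadata": {"source": "chat"}}, "user-prefs"
)
b = from_qdrant(
    {"id": 7, "vector": [0.1, 0.2], "payload": {"source": "chat"}}, "user-prefs"
)
```

Once both sides agree on one record shape, each store needs only one exporter and one importer instead of a custom pair for every source and target combination.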

The deeper problem is that your vectors are your data. The text embeddings you generated over months, the memory stores your agents have built up, the knowledge graphs encoded in float space: none of that is easy to recreate if you're locked into a store you want to leave.

The .vmig.jsonl format is deliberately simple: one JSON record per line, with a defined schema covering id, vector, text, dims, model, namespace, and metadata. Every export is accompanied by a meta file carrying a checksum, so the data can be verified before it is imported.

What Vex Does

Vex gives you four commands:

vex export --from pinecone --index my-index --out memories.vmig.jsonl
vex import --to qdrant --collection agents --in memories.vmig.jsonl
vex migrate --from chroma --to weaviate --collection docs
vex validate memories.vmig.jsonl

export pulls all vectors from a source store into a local .vmig.jsonl file. import loads a .vmig.jsonl into a target store, auto-creating schemas and collections where the connector supports it. migrate chains them in one step with a local file as the intermediate. validate lints every record against the vmig spec and reports errors.

Connection config is passed as flags or environment variables. No config files required. No daemon to run.

The .vmig.jsonl Format

Each line in a .vmig.jsonl file is a self-contained vector record:

{
  "id": "mem_8f3a1c",
  "text": "The user prefers async communication and works in the EU timezone.",
  "vector": [0.021, -0.184, 0.093, ...],
  "dims": 1536,
  "model": "text-embedding-3-small",
  "namespace": "user-prefs",
  "metadata": { "source": "chat", "ts": "2026-04-28T14:32:00Z" },
  "created_at": "2026-04-28T14:32:01Z",
  "source_store": "pinecone",
  "modality": "text",
  "vex_version": "1.0.0"
}
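
As a rough illustration of the kind of check a linter can run on one such line, here is a small Python sketch. It is not the vex validate implementation. The set of required fields is an assumption (the real spec may require more), and check_record is a hypothetical helper name.

```python
import json

REQUIRED = ("id", "vector", "dims")  # assumed minimum; the real spec may differ


def check_record(line: str) -> list[str]:
    """Return the problems found in one .vmig.jsonl line (empty list if none)."""
    try:
        rec = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(rec, dict):
        return ["record is not a JSON object"]

    problems = [f"missing field: {key}" for key in REQUIRED if key not in rec]
    vector = rec.get("vector")
    if isinstance(vector, list):
        numeric = all(
            isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector
        )
        if not numeric:
            problems.append("vector contains non-numeric values")
        # dims must agree with the actual vector length
        if isinstance(rec.get("dims"), int) and rec["dims"] != len(vector):
            problems.append(
                f"dims is {rec['dims']} but vector has {len(vector)} values"
            )
    elif "vector" in rec:
        problems.append("vector is not an array")
    return problems


good = '{"id": "mem_8f3a1c", "vector": [0.021, -0.184, 0.093], "dims": 3}'
bad = '{"id": "mem_8f3a1c", "vector": [0.021, -0.184], "dims": 1536}'
```

Because each line is self-contained, a check like this can run record by record and report every failing line rather than stopping at the first one.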

Alongside every export, Vex writes a .vmig.meta.json file with record count, export timestamp, source store, and a SHA-256 checksum over all records. This lets you verify a file hasn't been corrupted or truncated before importing into a production store.
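
A pre-import integrity check along these lines could look like the Python sketch below. Two things here are guesses rather than documented behaviour: the meta keys record_count and sha256, and the idea that the hash covers the raw bytes of the data file. The file names in the demo are also only examples.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def verify_export(data_path: str, meta_path: str) -> bool:
    """Compare an export's record count and SHA-256 with its meta file.

    Assumes the meta JSON has "record_count" and "sha256" keys and that the
    hash covers the raw bytes of the data file; both are assumptions.
    """
    meta = json.loads(Path(meta_path).read_text(encoding="utf-8"))
    digest = hashlib.sha256()
    count = 0
    with open(data_path, "rb") as fh:
        for line in fh:  # read line by line so large exports stay out of memory
            digest.update(line)
            if line.strip():
                count += 1
    return count == meta["record_count"] and digest.hexdigest() == meta["sha256"]


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp, "memories.vmig.jsonl")
    meta = Path(tmp, "memories.vmig.meta.json")
    data.write_bytes(b'{"id": "a"}\n{"id": "b"}\n')
    checksum = hashlib.sha256(data.read_bytes()).hexdigest()
    meta.write_text(json.dumps({"record_count": 2, "sha256": checksum}), encoding="utf-8")

    intact = verify_export(str(data), str(meta))
    data.write_bytes(b'{"id": "a"}\n')  # simulate a truncated file
    truncated = verify_export(str(data), str(meta))
```

Checking the count as well as the hash gives a clearer failure reason when a file has simply been cut short.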

Connector Status

STORE	EXPORT	IMPORT	NOTES
vektor	✓	✓	Native — SQLite via better-sqlite3
jsonl	✓	✓	Plain file passthrough
pinecone	✓	✓	REST API, v1 index format
qdrant	✓	✓	REST API, auto-creates collection
chroma	✓	✓	HTTP client, collection auto-create
weaviate	✓	✓	Schema auto-create, batch import
pgvector	✓	✓	pg driver, table + ivfflat index auto-create

v0.2.0 — Weaviate, pgvector & Streaming

v0.2.0 adds two new connectors, streaming mode for large datasets, and re-embedding support.

Weaviate

The Weaviate connector uses the v4 HTTP API. On import, it inspects the target instance for an existing class matching your collection name. If it doesn't find one, it creates a schema with the correct vectorIndexConfig derived from the dims in your vmig file. Export uses GraphQL cursor pagination and normalises all scalar properties as metadata.

pgvector

The pgvector connector connects via the pg Node driver — no ORM, no extra deps. On import it runs CREATE TABLE IF NOT EXISTS with a vector(N) column and a CREATE INDEX using ivfflat cosine distance. Export runs a full table scan. Both text and jsonb metadata columns are handled automatically.
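
For readers unfamiliar with pgvector, the statements described above look roughly like the output of this Python sketch. The vector(N) column type, the ivfflat index method, the vector_cosine_ops operator class, and the lists option are standard pgvector syntax. The table layout, column names, index name, and the pgvector_ddl helper are assumptions for illustration, not what the connector actually emits.

```python
import re


def pgvector_ddl(table: str, dims: int, lists: int = 100) -> list[str]:
    """Build CREATE TABLE and CREATE INDEX statements for a pgvector table."""
    # Identifiers cannot be bound as query parameters, so validate them strictly.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unsafe table name: {table!r}")
    if dims <= 0:
        raise ValueError("dims must be positive")
    return [
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "id text PRIMARY KEY, "
        "content text, "
        "metadata jsonb, "
        f"embedding vector({dims}))",
        f"CREATE INDEX IF NOT EXISTS {table}_embedding_idx ON {table} "
        f"USING ivfflat (embedding vector_cosine_ops) WITH (lists = {lists})",
    ]


statements = pgvector_ddl("agents", 1536)
```

The vector type only exists once the pgvector extension has been enabled in the target database, so that needs to be in place before statements like these can run.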

Streaming mode

Datasets over 100k vectors automatically switch to streaming mode — records are piped through line-by-line without ever loading the full dataset into memory. The threshold is configurable and the same vex migrate command handles both paths transparently.
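
The idea behind streaming can be shown with a short Python sketch: a generator reads one record at a time and a writer consumes it immediately, so memory use stays flat regardless of file size. The function names are hypothetical and this is not how Vex is implemented internally, only the general pattern.

```python
import io
import json
from typing import Iterable, Iterator, TextIO


def read_records(src: TextIO) -> Iterator[dict]:
    """Yield one parsed record at a time; blank lines are skipped."""
    for line in src:
        if line.strip():
            yield json.loads(line)


def write_records(records: Iterable[dict], dst: TextIO) -> int:
    """Write records as JSON Lines and return how many were written."""
    written = 0
    for rec in records:
        dst.write(json.dumps(rec, separators=(",", ":")) + "\n")
        written += 1
    return written


# In-memory streams stand in for the source export and the target file.
source = io.StringIO('{"id": "a", "vector": [0.1]}\n\n{"id": "b", "vector": [0.2]}\n')
target = io.StringIO()
count = write_records(read_records(source), target)
```

Only one record is held at any moment, which is why the JSON Lines layout suits this mode better than a single large JSON array.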

--reembed & --embed-model

When a migration encounters a dims mismatch and the records contain text, you can pass --reembed --embed-model text-embedding-3-small to re-embed on the fly using OpenAI or a local Ollama endpoint. Records without text are flagged and skipped rather than silently corrupted.
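
The decision logic described here can be sketched in Python as follows. The embed callable is a stand-in for whatever embedding backend is used; no real OpenAI or Ollama call is made, and the reembed function, its parameters, and the toy embedder are all hypothetical.

```python
from typing import Callable, Iterable, Iterator


def reembed(
    records: Iterable[dict],
    target_dims: int,
    model: str,
    embed: Callable[[str], list[float]],
    skipped: list[str],
) -> Iterator[dict]:
    """Re-embed records whose dims differ from the target; skip those without text."""
    for rec in records:
        if rec.get("dims") == target_dims:
            yield rec  # already the right size, pass through untouched
        elif rec.get("text"):
            vector = embed(rec["text"])
            if len(vector) != target_dims:
                raise ValueError(f"embedder returned {len(vector)} dims")
            yield {**rec, "vector": vector, "dims": target_dims, "model": model}
        else:
            skipped.append(rec["id"])  # flag it instead of writing a wrong-sized vector


def fake_embed(text: str) -> list[float]:
    """Toy 3-dimensional embedder used only for this demo."""
    return [float(len(text)), 0.0, 1.0]


records = [
    {"id": "a", "vector": [0.1, 0.2, 0.3], "dims": 3, "text": "kept"},
    {"id": "b", "vector": [0.1, 0.2], "dims": 2, "text": "hello"},
    {"id": "c", "vector": [0.1, 0.2], "dims": 2},
]
skipped: list[str] = []
out = list(reembed(records, 3, "toy-model", fake_embed, skipped))
```

Collecting skipped ids separately keeps the output free of mismatched vectors while still giving a list of records to follow up on.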

Phase 4 — vex-adapter

The drift adapter is now live as a separate package. @vektormemory/vex-adapter translates vectors between embedding model spaces using pre-trained linear projection weights — no API calls, no re-embedding, pure matrix math.

@vektormemory/vex-adapter

Translate embeddings between model spaces in milliseconds. A learned linear projection W maps v_source → v_target with L2 normalisation. Works on datasets of any size via streaming. Train custom projections from aligned pairs with vex-adapt train.
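
The underlying arithmetic is small enough to show in plain Python. This sketch applies a weight matrix W to a source vector and L2-normalises the result. The toy weights are made up for the example, and whether the real adapter also uses a bias term or a different normalisation step is not something this sketch claims to know.

```python
import math


def project(vector: list[float], weights: list[list[float]]) -> list[float]:
    """Compute W @ v_source, then L2-normalise the result."""
    if any(len(row) != len(vector) for row in weights):
        raise ValueError("each row of W must match the source dimension")
    out = [sum(w * x for w, x in zip(row, vector)) for row in weights]
    norm = math.sqrt(sum(x * x for x in out))
    return [x / norm for x in out] if norm > 0 else out


# Toy projection from a 2-dimensional space to a 3-dimensional one.
W = [
    [1.0, 0.0],
    [0.0, 2.0],
    [1.0, 1.0],
]
v_target = project([3.0, 0.0], W)
```

W has one row per target dimension and one column per source dimension, so a projection from a 384-dimensional model to a 1536-dimensional one would be a 1536 by 384 matrix applied once per record.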

bge-small → text-embedding-3-small
bge-base → text-embedding-3-small
bge-large → text-embedding-3-large
all-MiniLM-L6 → text-embedding-3-small
all-mpnet-base → text-embedding-3-small
ada-002 → text-embedding-3-small
e5-base → text-embedding-3-small

npm install -g @vektormemory/vex-adapter

vex-adapt --from bge-small-en-v1.5 --to text-embedding-3-small memories.vmig.jsonl
vex-adapt train --from my-model --to text-embedding-3-small --pairs aligned.jsonl
vex-adapt list

Installation

# Core migration tool
npm install -g @vektormemory/vex

# Drift adapter (vec2vec, no re-embedding)
npm install -g @vektormemory/vex-adapter

Node.js 18+ required. Zero runtime dependencies. Apache 2.0 license.

Try Vex

Open source. Zero dependencies. Works with any supported vector store.
