RAG memory
Per-agent knowledge bases with Voyage AI, OpenAI, or local Ollama embeddings. Each agent's query script is locked to its own data — no multi-tenant leakage.
What RAG does
RAG (retrieval-augmented generation) gives an agent a searchable knowledge base. When the agent answers a question, it first searches its indexed documents and injects the most relevant chunks into its context. This lets a Claude-powered agent answer questions about things the base model has never seen — your contracts, internal docs, product manuals.
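The retrieve-then-inject loop can be sketched with toy vectors. Everything here is illustrative, not EXMER's actual code: the helper names are hypothetical and the 2-d "embeddings" stand in for real provider output.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query_vec, index, top_k=2):
    # index: list of (chunk_text, embedding) pairs; return the top_k closest chunks.
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_prompt(question, chunks):
    # Inject the retrieved chunks into the model's context.
    context = "\n---\n".join(chunks)
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"

index = [("Refund policy: 30 days.", [1.0, 0.1]),
         ("Office hours: 9-5.", [0.0, 1.0])]
print(build_prompt("What is the refund window?", retrieve([0.9, 0.2], index)))
```

The real pipeline embeds the question with the same provider that embedded the documents, so query and chunks live in the same vector space.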
Architecture (per-server)
EXMER deploys a self-contained Python venv on each OpenClaw server:
~/.openclaw/rag/
venv/ # isolated Python env with sqlite-vec + requests
setup.py # self-test: verifies imports and vec extension
status.py # per-agent status reader
indexer.py # chunk + embed + insert into SQLite
Install runs as a 7-step background task — detects OS, installs system packages via native pkg manager, creates fresh venv, upgrades pip + wheel + setuptools, installs latest sqlite-vec + requests, writes helper scripts, runs self-test to verify everything loads.
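A rough sketch of the kind of check the setup.py self-test performs, using sqlite-vec's documented Python loading pattern (the function name and return shape are assumptions):

```python
import sqlite3

def self_test():
    """Verify the venv's key dependencies import and the vec extension loads."""
    checks = {}
    try:
        import requests  # noqa: F401
        checks["requests"] = "ok"
    except ImportError:
        checks["requests"] = "missing"
    try:
        import sqlite_vec
        db = sqlite3.connect(":memory:")
        db.enable_load_extension(True)
        sqlite_vec.load(db)  # registers the vec0 virtual table and vec_* functions
        version, = db.execute("select vec_version()").fetchone()
        checks["sqlite-vec"] = version
    except Exception as exc:
        checks["sqlite-vec"] = f"failed: {exc}"
    return checks

print(self_test())
```

Running the test inside the venv (not the system Python) is what makes it meaningful: it exercises exactly the interpreter the indexer and query scripts will use.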
Per-agent isolation (important)
Each agent has its own private state at ~/.openclaw/agents/<id>/rag/:
config.json # provider + model + API key (chmod 600)
knowledge/ # uploaded files (chmod 700)
index.db # sqlite-vec store
query.py # per-agent script with AGENT_ID BAKED IN
The query.py script is NOT global. It's generated per-agent when you configure RAG, and the AGENT_ID and RAG_DIR are hardcoded into the file contents. There is no --agent flag — the script can physically only read its own index.db and config.json.
If there were instead one global script taking query.py --agent X, any agent could call query.py --agent other_agent via its bash tool and read another tenant's data. That's a multi-tenant security hole. The per-agent approach makes it physically impossible.
Embedding providers
Two categories in the UI:
Local (free): Ollama
Runs locally on the OpenClaw server. No API key, no cost, no data leaving the box. EXMER auto-installs Ollama via a 9-step background task:
- Detect CPU architecture
- Ensure zstd is installed (needed to extract the Ollama tarball)
- Check if Ollama is already installed (skip download if yes)
- Download tarball directly from GitHub releases
- Extract via unzstd | tar
- Install binary to /usr/local/bin/ollama
- Create ollama system user
- Write our own systemd unit (not the upstream one — more reliable)
- Start + verify via HTTP
Then a separate "Pull model" task downloads the embedding model (e.g. nomic-embed-text, mxbai-embed-large).
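Once a model is pulled, embedding goes through Ollama's local HTTP API. A minimal stdlib client (the payload shape follows Ollama's documented /api/embeddings endpoint; the port is Ollama's default):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # Ollama's default listen address

def build_payload(text: str, model: str = "nomic-embed-text") -> bytes:
    # /api/embeddings takes a model name and a single prompt string.
    return json.dumps({"model": model, "prompt": text}).encode()

def embed(text: str, model: str = "nomic-embed-text") -> list:
    """POST one text to Ollama and return its embedding vector."""
    req = urllib.request.Request(
        OLLAMA_URL + "/api/embeddings",
        data=build_payload(text, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["embedding"]
```

Since the request never leaves localhost, no API key or TLS setup is involved.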
Cloud (paid): Voyage AI or OpenAI
No local install needed, just an API key.
- Voyage AI — separate service from voyageai.com (NOT Anthropic). Anthropic uses Voyage internally for their own embedding needs. First 200M tokens free. Keys start with pa-.
- OpenAI — text-embedding-3-small or text-embedding-3-large. Keys start with sk- but not sk-ant-.
API key validation
When you save a config, EXMER does two checks before writing the file:
- Format regex — catches obvious wrong-provider paste (sk-ant- in Voyage, pa- in OpenAI, etc.) with a helpful error.
- Live API test — makes one tiny embeddings request against the provider. Catches revoked keys, wrong-account keys, model-not-available errors. 401/403 = reject with a clear message.
This prevents the silent-failure case where a bad key saves "successfully" and every query then fails.
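A sketch of the two checks. The regex patterns are assumptions based on the prefixes above, and the model names stand in for whatever the config specifies; the endpoints are the providers' public embeddings URLs.

```python
import re
import json
import urllib.request
import urllib.error

KEY_FORMAT = {
    "voyage": re.compile(r"^pa-[A-Za-z0-9_-]+$"),
    "openai": re.compile(r"^sk-(?!ant-)[A-Za-z0-9_-]+$"),
}

def check_format(provider: str, key: str):
    """Return an error message for obvious wrong-provider pastes, else None."""
    if key.startswith("sk-ant-"):
        return "That looks like an Anthropic key; this provider needs its own key."
    if not KEY_FORMAT[provider].match(key):
        return f"Key does not match the expected {provider} format."
    return None

def live_test(provider: str, key: str):
    """One tiny embeddings request; 401/403 means the key is bad."""
    url, payload = {
        "openai": ("https://api.openai.com/v1/embeddings",
                   {"model": "text-embedding-3-small", "input": "ping"}),
        "voyage": ("https://api.voyageai.com/v1/embeddings",
                   {"model": "voyage-3", "input": ["ping"]}),
    }[provider]
    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                 headers={"Authorization": f"Bearer {key}",
                                          "Content-Type": "application/json"})
    try:
        urllib.request.urlopen(req, timeout=10)
    except urllib.error.HTTPError as exc:
        if exc.code in (401, 403):
            return "Key rejected by provider (revoked or wrong account?)."
        return f"Provider returned HTTP {exc.code}."
    return None
```

The config file is only written when both functions return None, so a key that passes format but fails the live test never reaches disk.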
Three UI states (honest status)
Instead of an ambiguous "configured" badge, EXMER shows three distinct states:
- settings only (warning) — config exists but chunkCount === 0. Agent CANNOT use RAG. If the agent claims to have memory, it's hallucinating.
- indexed (neutral) — chunks > 0 but SKILL.md not enabled. Agent has data but doesn't know how to reach it.
- active (success) — chunks > 0 AND skill enabled. Agent can actually use RAG.
Only active means the agent can honestly say "I have RAG memory". Anything less is a hallucination risk.
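The state derivation is a simple pure function over two facts. A sketch (the function name is hypothetical; the inputs follow the chunkCount and skill wording above):

```python
def rag_status(chunk_count: int, skill_enabled: bool) -> str:
    """Map index contents + SKILL.md flag to the three UI badges."""
    if chunk_count == 0:
        return "settings only"  # warning: config saved, agent cannot use RAG
    if not skill_enabled:
        return "indexed"        # neutral: data exists, agent can't reach it
    return "active"             # success: agent can honestly claim RAG memory
```

Deriving the badge from observed state rather than storing a "configured" flag is what keeps the status honest: it can never drift from reality.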
Filesystem permissions
chmod 700 on rag/ directories, 600 on config.json. Note: this only helps if OpenClaw runs each agent under a different UNIX user. In a typical root install, all agents share UID 0 and the file permissions are moot. Fix this at the OpenClaw layer — EXMER warns but cannot force it.
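An audit along these lines could be sketched as follows (the function name is hypothetical; it checks modes only, not ownership, which is exactly the gap the warning above describes):

```python
import os
import stat

def check_perms(rag_dir: str) -> list:
    """Warn when modes are looser than 700 on the dir / 600 on config.json."""
    warnings = []
    mode = stat.S_IMODE(os.stat(rag_dir).st_mode)
    if mode & 0o077:  # any group/other bits set
        warnings.append(f"{rag_dir}: mode {oct(mode)}, expected 0o700")
    cfg = os.path.join(rag_dir, "config.json")
    if os.path.exists(cfg):
        cmode = stat.S_IMODE(os.stat(cfg).st_mode)
        if cmode & 0o177:  # execute bit or any group/other bits set
            warnings.append(f"{cfg}: mode {oct(cmode)}, expected 0o600")
    return warnings
```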
AI Analysis is RAG-aware
When you run AI Analysis on a server with RAG installed, the backend injects a RAG_ARCHITECTURE_DOC into Claude's system prompt. It tells Claude what files exist, which read-only commands are safe, and what's strictly forbidden (no rm/mv on agent rag dirs, no direct index.db writes, never read config.json which holds the API key).
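Conceptually, the injection is plain prompt concatenation. A sketch, where the doc text is paraphrased from this section rather than being the real RAG_ARCHITECTURE_DOC:

```python
RAG_ARCHITECTURE_DOC = """\
RAG layout: ~/.openclaw/rag/ (shared venv) and per-agent state under
~/.openclaw/agents/<id>/rag/. Read-only inspection is allowed. Strictly
forbidden: rm/mv on agent rag dirs, direct writes to index.db, and reading
config.json (it holds the API key).
"""

def build_system_prompt(base_prompt: str, rag_installed: bool) -> str:
    """Append the architecture doc only when RAG is actually installed."""
    if rag_installed:
        return base_prompt + "\n\n" + RAG_ARCHITECTURE_DOC
    return base_prompt
```

Gating on whether RAG is installed keeps the system prompt lean on servers that never configured it.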