RAG memory
Per-agent knowledge bases with Voyage AI, OpenAI, or local Ollama embeddings. Each agent's query script is locked to its own data — no multi-tenant leakage.
What RAG does
RAG (retrieval-augmented generation) gives an agent a searchable knowledge base. When the agent answers a question, it first searches its indexed documents and injects the most relevant chunks into its context. This lets a Claude-powered agent answer questions about things the base model has never seen — your contracts, internal docs, product manuals.
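The retrieve-then-inject loop can be sketched with toy vectors. Everything here is illustrative, not EXMER's actual code: the helper names are hypothetical and the 2-d "embeddings" stand in for real provider output.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query_vec, index, top_k=2):
    # index: list of (chunk_text, embedding) pairs; return the top_k closest chunks.
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_prompt(question, chunks):
    # Inject the retrieved chunks into the model's context.
    context = "\n---\n".join(chunks)
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"

index = [("Refund policy: 30 days.", [1.0, 0.1]),
         ("Office hours: 9-5.", [0.0, 1.0])]
print(build_prompt("What is the refund window?", retrieve([0.9, 0.2], index)))
```

The real pipeline embeds the question with the same provider that embedded the documents, so query and chunks live in the same vector space.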
Architecture (per-server)
EXMER deploys a self-contained Python venv on each OpenClaw server:
~/.openclaw/rag/
venv/ # isolated Python env with sqlite-vec + requests
setup.py # self-test: verifies imports and vec extension
status.py # per-agent status reader
indexer.py # chunk + embed + insert into SQLite
Install runs as a 7-step background task — detects OS, installs system packages via native pkg manager, creates fresh venv, upgrades pip + wheel + setuptools, installs latest sqlite-vec + requests, writes helper scripts, runs self-test to verify everything loads.
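A rough sketch of the kind of check the setup.py self-test performs, using sqlite-vec's documented Python loading pattern (the function name and return shape are assumptions):

```python
import sqlite3

def self_test():
    """Verify the venv's key dependencies import and the vec extension loads."""
    checks = {}
    try:
        import requests  # noqa: F401
        checks["requests"] = "ok"
    except ImportError:
        checks["requests"] = "missing"
    try:
        import sqlite_vec
        db = sqlite3.connect(":memory:")
        db.enable_load_extension(True)
        sqlite_vec.load(db)  # registers the vec0 virtual table and vec_* functions
        version, = db.execute("select vec_version()").fetchone()
        checks["sqlite-vec"] = version
    except Exception as exc:
        checks["sqlite-vec"] = f"failed: {exc}"
    return checks

print(self_test())
```

Running the test inside the venv (not the system Python) is what makes it meaningful: it exercises exactly the interpreter the indexer and query scripts will use.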
Per-agent isolation (important)
Each agent has its own private state at ~/.openclaw/agents/<id>/rag/:
config.json # provider + model + API key (chmod 600)
knowledge/ # uploaded files (chmod 700)
index.db # sqlite-vec store
query.py # per-agent script with AGENT_ID BAKED IN
The query.py script is NOT global. It's generated per-agent when you configure RAG, and the AGENT_ID and RAG_DIR are hardcoded into the file contents. There is no --agent flag — the script can physically only read its own index.db and config.json.
If there were instead one global script taking query.py --agent X, any agent could call query.py --agent other_agent via its bash tool and read another tenant's data. That's a multi-tenant security hole. The per-agent approach makes it physically impossible.
Embedding providers
Two categories in the UI:
Local (free): Ollama
Runs locally on the OpenClaw server. No API key, no cost, no data leaving the box. EXMER auto-installs Ollama via a 9-step background task:
- Detect CPU architecture
- Ensure zstd is installed (needed to extract the Ollama tarball)
- Check if Ollama is already installed (skip download if yes)
- Download tarball directly from GitHub releases
- Extract via unzstd | tar
- Install binary to /usr/local/bin/ollama
- Create ollama system user
- Write our own systemd unit (not the upstream one — more reliable)
- Start + verify via HTTP
Then a separate "Pull model" task downloads the embedding model (e.g. nomic-embed-text, mxbai-embed-large).
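Once a model is pulled, embedding goes through Ollama's local HTTP API. A minimal stdlib client (the payload shape follows Ollama's documented /api/embeddings endpoint; the port is Ollama's default):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # Ollama's default listen address

def build_payload(text: str, model: str = "nomic-embed-text") -> bytes:
    # /api/embeddings takes a model name and a single prompt string.
    return json.dumps({"model": model, "prompt": text}).encode()

def embed(text: str, model: str = "nomic-embed-text") -> list:
    """POST one text to Ollama and return its embedding vector."""
    req = urllib.request.Request(
        OLLAMA_URL + "/api/embeddings",
        data=build_payload(text, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["embedding"]
```

Since the request never leaves localhost, no API key or TLS setup is involved.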
Cloud (paid): Voyage AI or OpenAI
No local install needed, just an API key.
- Voyage AI — separate service from voyageai.com (NOT Anthropic). Anthropic uses Voyage internally for their own embedding needs. First 200M tokens free. Keys start with pa-.
- OpenAI — text-embedding-3-small or text-embedding-3-large. Keys start with sk- but not sk-ant-.
API key validation
When you save a config, EXMER does two checks before writing the file:
- Format regex — catches obvious wrong-provider paste (sk-ant- in Voyage, pa- in OpenAI, etc.) with a helpful error.
- Live API test — makes one tiny embeddings request against the provider. Catches revoked keys, wrong-account keys, model-not-available errors. 401/403 = reject with a clear message.
This prevents the silent-failure case where a bad key saves "successfully" and every query then fails.
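A sketch of the two checks. The regex patterns are assumptions based on the prefixes above, and the model names stand in for whatever the config specifies; the endpoints are the providers' public embeddings URLs.

```python
import re
import json
import urllib.request
import urllib.error

KEY_FORMAT = {
    "voyage": re.compile(r"^pa-[A-Za-z0-9_-]+$"),
    "openai": re.compile(r"^sk-(?!ant-)[A-Za-z0-9_-]+$"),
}

def check_format(provider: str, key: str):
    """Return an error message for obvious wrong-provider pastes, else None."""
    if key.startswith("sk-ant-"):
        return "That looks like an Anthropic key; this provider needs its own key."
    if not KEY_FORMAT[provider].match(key):
        return f"Key does not match the expected {provider} format."
    return None

def live_test(provider: str, key: str):
    """One tiny embeddings request; 401/403 means the key is bad."""
    url, payload = {
        "openai": ("https://api.openai.com/v1/embeddings",
                   {"model": "text-embedding-3-small", "input": "ping"}),
        "voyage": ("https://api.voyageai.com/v1/embeddings",
                   {"model": "voyage-3", "input": ["ping"]}),
    }[provider]
    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                 headers={"Authorization": f"Bearer {key}",
                                          "Content-Type": "application/json"})
    try:
        urllib.request.urlopen(req, timeout=10)
    except urllib.error.HTTPError as exc:
        if exc.code in (401, 403):
            return "Key rejected by provider (revoked or wrong account?)."
        return f"Provider returned HTTP {exc.code}."
    return None
```

The config file is only written when both functions return None, so a key that passes format but fails the live test never reaches disk.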
Three UI states (honest status)
Instead of an ambiguous "configured" badge, EXMER shows three distinct states:
- settings only (warning) — config exists but chunkCount === 0. Agent CANNOT use RAG. If the agent claims to have memory, it's hallucinating.
- indexed (neutral) — chunks > 0 but SKILL.md not enabled. Agent has data but doesn't know how to reach it.
- active (success) — chunks > 0 AND skill enabled. Agent can actually use RAG.
Only active means the agent can honestly say "I have RAG memory". Anything less is a hallucination risk.
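The state derivation is a simple pure function over two facts. A sketch (the function name is hypothetical; the inputs follow the chunkCount and skill wording above):

```python
def rag_status(chunk_count: int, skill_enabled: bool) -> str:
    """Map index contents + SKILL.md flag to the three UI badges."""
    if chunk_count == 0:
        return "settings only"  # warning: config saved, agent cannot use RAG
    if not skill_enabled:
        return "indexed"        # neutral: data exists, agent can't reach it
    return "active"             # success: agent can honestly claim RAG memory
```

Deriving the badge from observed state rather than storing a "configured" flag is what keeps the status honest: it can never drift from reality.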
Filesystem permissions
chmod 700 on rag/ directories, 600 on config.json. Note: this only helps if OpenClaw runs each agent under a different UNIX user. In a typical root install, all agents share UID 0 and the file permissions are moot. Fix this at the OpenClaw layer — EXMER warns but cannot force it.
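An audit along these lines could be sketched as follows (the function name is hypothetical; it checks modes only, not ownership, which is exactly the gap the warning above describes):

```python
import os
import stat

def check_perms(rag_dir: str) -> list:
    """Warn when modes are looser than 700 on the dir / 600 on config.json."""
    warnings = []
    mode = stat.S_IMODE(os.stat(rag_dir).st_mode)
    if mode & 0o077:  # any group/other bits set
        warnings.append(f"{rag_dir}: mode {oct(mode)}, expected 0o700")
    cfg = os.path.join(rag_dir, "config.json")
    if os.path.exists(cfg):
        cmode = stat.S_IMODE(os.stat(cfg).st_mode)
        if cmode & 0o177:  # execute bit or any group/other bits set
            warnings.append(f"{cfg}: mode {oct(cmode)}, expected 0o600")
    return warnings
```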
AI Analysis is RAG-aware
When you run AI Analysis on a server with RAG installed, the backend injects a RAG_ARCHITECTURE_DOC into Claude's system prompt. It tells Claude what files exist, which read-only commands are safe, and what's strictly forbidden (no rm/mv on agent rag dirs, no direct index.db writes, never read config.json which holds the API key).
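Conceptually, the injection is plain prompt concatenation. A sketch, where the doc text is paraphrased from this section rather than being the real RAG_ARCHITECTURE_DOC:

```python
RAG_ARCHITECTURE_DOC = """\
RAG layout: ~/.openclaw/rag/ (shared venv) and per-agent state under
~/.openclaw/agents/<id>/rag/. Read-only inspection is allowed. Strictly
forbidden: rm/mv on agent rag dirs, direct writes to index.db, and reading
config.json (it holds the API key).
"""

def build_system_prompt(base_prompt: str, rag_installed: bool) -> str:
    """Append the architecture doc only when RAG is actually installed."""
    if rag_installed:
        return base_prompt + "\n\n" + RAG_ARCHITECTURE_DOC
    return base_prompt
```

Gating on whether RAG is installed keeps the system prompt lean on servers that never configured it.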