ɳClaw: a personal AI with infinite memory

April 8, 2026 · Aric Camarata · 7 min read

Tags: nclaw · ai · architecture · memory · launch

Every AI assistant forgets. You start a new chat, and the context from last week is gone. Some tools offer "memory" by stuffing old messages into the system prompt. That works until the context window fills up, and then it silently drops the oldest memories with no way to know what was lost.

ɳClaw takes a different approach. Memory is not a feature bolted onto a chat interface. It is the foundation the entire product is built on.

What ɳClaw is

ɳClaw is a personal AI assistant that runs on your own server. You self-host it using the nSelf CLI, which means your conversations, your memories, your data never leave your machine. It connects to any LLM provider you choose (OpenAI, Anthropic, local models via Ollama) through a multi-model router called Mux.

The key difference from other AI tools: ɳClaw remembers everything you have ever told it, and it organizes that knowledge automatically. There is no "New Chat" button. There is no context window limit on your history. You just talk to it, and over time it builds a complete picture of your life, your work, your preferences, your decisions.

Open the sidebar and you do not see a list of chat sessions. You see topics: Work > ProjectX > API Design, Personal > Health > Running Log, Finance > Taxes > 2026. ɳClaw created those topics by listening to your conversations. You never had to organize anything.

The four-layer memory architecture

ɳClaw's memory system has four layers, all backed by PostgreSQL. This is not a simplified overview. This is the actual architecture.

Layer 1: Raw conversation store

Every message you send and receive is stored with full metadata: timestamp, topic assignment, thread ID, and vector embeddings. This is the audit trail. You can always go back and read the exact conversation, word for word, from six months ago or two years ago.

Messages are immutable once stored. ɳClaw never edits or summarizes your original words. The raw record is sacred.
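To make the shape of a Layer 1 record concrete, here is a minimal sketch. The field names are illustrative assumptions, not ɳClaw's actual schema; a frozen dataclass stands in for the database's append-only guarantee.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical sketch of a raw-store record. Field names are
# illustrative, not nClaw's real schema.
@dataclass(frozen=True)  # frozen: the raw record is immutable once stored
class Message:
    message_id: str
    thread_id: str
    topic_path: str     # e.g. "work.projectx.api_design"
    role: str           # "user" or "assistant"
    content: str
    created_at: datetime
    embedding: tuple    # vector embedding (pgvector column in Postgres)

msg = Message(
    message_id="m-001",
    thread_id="t-42",
    topic_path="work.projectx.api_design",
    role="user",
    content="Let's go with Postgres instead of MongoDB.",
    created_at=datetime(2026, 4, 8, tzinfo=timezone.utc),
    embedding=(0.1, 0.2, 0.3),
)

# Any attempt to rewrite history fails loudly:
try:
    msg.content = "edited"
except AttributeError:
    print("immutable")
```

The point of the sketch is the `frozen=True`: edits to stored messages are rejected at the type level, mirroring the "raw record is sacred" rule.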

Layer 2: Extracted facts

After every exchange, ɳClaw runs an extraction pass. It pulls out structured facts: decisions you made, preferences you stated, information you shared, deadlines you mentioned. Each fact gets:

  • A category: preference, decision, fact, deadline, contact, goal
  • A confidence score: how certain ɳClaw is about the extraction (0.0 to 1.0)
  • A source reference: link back to the exact message it came from
  • A timestamp: when you said it
  • An expiry hint: some facts are time-bound ("meeting is next Tuesday")

"I prefer dark mode in all my apps" becomes a preference fact with high confidence. "I think we should probably use Redis for this" becomes a tentative decision with lower confidence. The distinction matters when ɳClaw retrieves facts later. High-confidence facts get priority.

Over time, facts accumulate. After a year of daily use, you might have 10,000 extracted facts. ɳClaw can search, filter, and reason over all of them instantly.
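A rough sketch of what a fact record and confidence-first retrieval might look like. The field names, categories, and example facts are assumptions for illustration, not ɳClaw's internal schema:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Illustrative Layer-2 fact record; names are assumptions, not nClaw's schema.
@dataclass
class Fact:
    category: str            # preference | decision | fact | deadline | contact | goal
    text: str
    confidence: float        # 0.0 .. 1.0
    source_message_id: str   # link back to the exact Layer-1 message
    stated_on: date
    expires_on: Optional[date] = None  # expiry hint for time-bound facts

def retrieve(facts, category, today):
    """Return live (unexpired) facts of a category, highest confidence first."""
    live = [f for f in facts
            if f.category == category
            and (f.expires_on is None or f.expires_on >= today)]
    return sorted(live, key=lambda f: f.confidence, reverse=True)

facts = [
    Fact("preference", "prefers dark mode in all apps", 0.95, "m-10", date(2026, 1, 5)),
    Fact("decision", "probably use Redis for caching", 0.55, "m-22", date(2026, 2, 1)),
    Fact("decision", "use Postgres instead of MongoDB", 0.92, "m-31", date(2026, 3, 3)),
]
for f in retrieve(facts, "decision", date(2026, 4, 8)):
    print(f.confidence, f.text)
```

The tentative "probably use Redis" decision sorts below the firm Postgres decision, which is exactly the prioritization described above.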

Layer 3: Entity graph

People, projects, companies, tools, locations, and concepts mentioned in conversations become nodes in a relationship graph. Edges connect them: "Sarah works at Acme," "ProjectX uses Redis," "the London office handles EU compliance."

When you mention "the API rewrite we discussed with Sarah last month," ɳClaw traverses the graph. It finds Sarah (person node), finds conversations involving Sarah and API topics, pulls the relevant facts and decisions, and gives you a complete answer with context.

The graph is stored in PostgreSQL using JSONB adjacency lists with GIN indexes. Each entity has a type, a canonical name, aliases (so "Sarah," "Sarah Chen," and "the new PM" all resolve to the same node), and metadata. Relationships have types (works-at, uses, located-in, related-to) and weights based on how frequently they appear together.

This is not a toy graph. At scale, a power user might have 5,000 entity nodes and 20,000 edges. PostgreSQL handles this without breaking a sweat. The query "find all entities related to ProjectX within 2 hops" runs in under 10ms.
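The "within 2 hops" query is a bounded breadth-first traversal. Here is a toy in-memory version, with a plain dict standing in for the JSONB adjacency lists (the entities and edges are made up for illustration):

```python
from collections import deque

# Toy stand-in for the JSONB adjacency lists: each entity maps to its
# directly related entities. Names are illustrative.
graph = {
    "ProjectX": ["Sarah", "Redis", "Acme"],
    "Sarah": ["ProjectX", "Acme"],
    "Redis": ["ProjectX"],
    "Acme": ["Sarah", "ProjectX", "London Office"],
    "London Office": ["Acme"],
}

def related_within(graph, start, max_hops):
    """All entities reachable from `start` in at most `max_hops` edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop budget
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    seen.discard(start)
    return seen

print(sorted(related_within(graph, "ProjectX", 2)))
```

In production the same traversal runs as a recursive query against the GIN-indexed adjacency lists, but the logic is this simple.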

Layer 4: Topic tree

Every conversation is auto-classified into a topic hierarchy using PostgreSQL's ltree extension. The hierarchy is dynamic. ɳClaw creates new branches as new topics emerge and merges branches when it detects overlap.

The topic tree serves two purposes. First, it powers the sidebar navigation. Instead of scrolling through hundreds of "New Chat" entries, you browse by topic. Second, it scopes search. When you are talking about work, ɳClaw prioritizes memories from work-related topics. This dramatically improves relevance.

Topic classification happens in real time. As you type, ɳClaw identifies the topic and either assigns the conversation to an existing branch or creates a new one. You can override or reclassify manually, but most users never need to. The auto-classification is accurate enough that the sidebar just works.

Tree queries in PostgreSQL are fast and indexed: ltree @> 'work.projectx' returns all subtopics of Work > ProjectX in milliseconds.
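The ancestor operator is easy to reason about: a path matches when it equals the ancestor or sits anywhere below it. A minimal Python stand-in for that check (the topic paths are illustrative):

```python
# Minimal stand-in for ltree's ancestor test: true when `path` equals
# `ancestor` or lies below it in the dotted hierarchy.
def ltree_contains(ancestor: str, path: str) -> bool:
    return path == ancestor or path.startswith(ancestor + ".")

topics = [
    "work.projectx",
    "work.projectx.api_design",
    "work.projectx.migration",
    "work.hiring",
    "personal.health.running_log",
]

# Equivalent in spirit to: SELECT ... WHERE topic_path <@ 'work.projectx'
print([t for t in topics if ltree_contains("work.projectx", t)])
```

Note the `ancestor + "."` guard: it keeps `work.projectx` from accidentally matching a sibling like `work.projectx2`.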

How retrieval works

When you ask ɳClaw a question, it does not just grep through old messages. It searches across all four layers simultaneously:

  1. Semantic search via pgvector: finds conceptually similar content even if the exact words differ. "What was that database decision?" matches a conversation where you said "let's go with Postgres instead of MongoDB."

  2. Full-text search via MeiliSearch: handles exact phrases, typo-tolerant queries, and faceted filters. Good for "find the message where I mentioned the $40K migration cost."

  3. Graph traversal: follows entity relationships to find connected information. "What does Sarah think about the API?" triggers a traversal from Sarah through her connected conversations and extracted opinions.

  4. Topic scoping: narrows results to the relevant topic branch first, then expands if needed. If you are in a work conversation, work memories come first.

The results from all four search paths are merged, deduplicated, scored by a combination of relevance and recency, and injected into the LLM's context window. The AI sees the most relevant memories for this specific question. Not a random sample. Not a summary. The actual relevant content.

For most queries, retrieval takes under 50ms. The user never notices a delay.

Why PostgreSQL for everything

We considered dedicated tools for each layer. Neo4j for the graph. Pinecone or Qdrant for vectors. Elasticsearch for full-text search. Each is excellent at its specific job.

We chose PostgreSQL for all of it. Here is why.

pgvector handles embeddings with HNSW indexing. For our scale (up to millions of vectors per user on a single instance), performance is within 5% of dedicated vector databases. The marginal improvement from Pinecone does not justify adding another service to self-host.

ltree handles the topic hierarchy natively. It is a first-class PostgreSQL extension with indexed tree operations. No external dependency needed.

JSONB stores flexible entity metadata and relationship attributes without schema migrations for every new field. The entity graph does not need a graph database when you have JSONB + GIN indexes.

MeiliSearch is the one exception. We use it alongside Postgres because its typo tolerance, faceted search, and instant results are meaningfully better than Postgres full-text search for the user-facing search experience. But MeiliSearch is a read replica of sorts. PostgreSQL remains the source of truth. If MeiliSearch goes down, ɳClaw falls back to Postgres full-text search and keeps working.
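The fallback behavior is a plain try/except at the search boundary. Both search functions below are hypothetical stubs standing in for the real clients; only the shape of the fallback is the point:

```python
# Sketch of the fallback path: MeiliSearch serves user-facing search, but
# PostgreSQL remains the source of truth. Both functions are hypothetical
# stubs, not real client calls.
def meili_search(query):
    raise ConnectionError("MeiliSearch is down")  # simulate an outage

def postgres_fts(query):
    return [f"pg-fts result for {query!r}"]

def search(query):
    try:
        return meili_search(query)
    except ConnectionError:
        # Degrade gracefully: Postgres full-text search keeps working.
        return postgres_fts(query)

print(search("$40K migration cost"))
```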

One primary database means one backup strategy, one replication stream, one point of operational complexity. For a self-hosted product, this matters enormously. Asking users to run PostgreSQL + Neo4j + Pinecone + Elasticsearch is a non-starter. Asking them to run PostgreSQL + MeiliSearch (both managed by nself start) is reasonable.

Multi-model routing with Mux

ɳClaw does not lock you into one AI provider. The Mux plugin handles model routing: it picks the best model for each task based on cost, speed, and capability.

A simple factual question gets routed to a fast, cheap model. A complex reasoning task goes to a frontier model. Code generation goes to a code-specialized model. You configure your available providers and API keys, and Mux handles the rest.

Currently supported: OpenAI (GPT-4o, GPT-4o-mini), Anthropic (Claude Opus, Sonnet, Haiku), Google (Gemini), and local models via Ollama. Adding a new provider is a configuration change, not a code change.
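Routing by task type can be pictured as a small lookup with a privacy override. The task categories, model identifiers, and table below are illustrative assumptions, not Mux's actual configuration format:

```python
# Hedged sketch of cost/capability routing in the spirit of Mux. The task
# kinds and model names here are illustrative assumptions.
ROUTES = {
    "simple_qa": {"model": "gpt-4o-mini", "reason": "fast and cheap"},
    "reasoning": {"model": "claude-opus", "reason": "frontier capability"},
    "code":      {"model": "claude-sonnet", "reason": "code-specialized"},
    "private":   {"model": "ollama/llama3", "reason": "never leaves the box"},
}

def route(task_kind, require_local=False):
    if require_local:  # maximum-privacy override: force a local model
        return ROUTES["private"]["model"]
    # Unknown task kinds fall back to the most capable model.
    return ROUTES.get(task_kind, ROUTES["reasoning"])["model"]

print(route("simple_qa"))                      # cheap model for a factual lookup
print(route("code"))                           # code-specialized model
print(route("simple_qa", require_local=True))  # forced through Ollama
```

Swapping providers is then a one-line change to the table, which is the property the post describes: a configuration change, not a code change.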

This means your AI assistant is not tied to any single company's pricing or availability. If OpenAI raises prices, switch your default to Anthropic. If you want maximum privacy, run everything through Ollama locally. ɳClaw does not care where the intelligence comes from. It cares about your memory.

What it looks like in practice

You wake up and tell ɳClaw about your day. "I have a meeting with Sarah at 2pm about the API migration. After that I need to review the Q3 budget draft."

ɳClaw remembers Sarah (entity node, linked to Acme Corp and ProjectX). It remembers the API migration (topic branch with 14 previous conversations). It remembers the Q3 budget (linked to Finance > Budget > 2026-Q3, with three extracted facts from previous discussions).

At 1:45pm, you ask: "Quick, what were the open issues from our last API discussion?" ɳClaw searches the topic branch, finds the three unresolved decisions from two weeks ago, and presents them with the original context. No scrolling through chat history. No searching your notes app. It just knows.

After the meeting, you tell ɳClaw: "We decided to go with the gradual migration approach. Sarah will own the first phase, targeting end of Q3." ɳClaw extracts a decision fact, creates a deadline entity for end of Q3, and updates Sarah's entity with the new responsibility. Tomorrow, next month, or next year, that decision is retrievable.

Pricing

ɳClaw is part of the Pro membership ($1.99/month or $19.99/year). That includes the AI plugin, the Claw plugin, Mux multi-model routing, voice input, browser integration, and more. You bring your own LLM API keys, so you pay the AI providers directly at their rates. No markup, no per-token fee from us.

The core nSelf stack is free. The AI plugins are Pro tier. Everything runs on your server.

Try it

brew install nself
nself init my-assistant
cd my-assistant
nself plugin install ai claw mux voice
nself build && nself start

Five minutes. Your own AI assistant with infinite memory. Your server, your data, your memories.

Check out the architecture docs for the technical deep-dive, or read about why self-hosted AI matters.

