Arete
Plaid for AI identity and context. A standard way for AI tools to know who you are.
"Who are you in the age of AI?" covers why I'm building this - the problem of fragmented AI identity, the lock-in trap, the Obsidian guy who validated it. This piece goes deeper on what I'm actually building and the bets I'm making.
Three layers
The architecture has three distinct layers, each solving a different problem.
MCP is the transport layer - how AI tools talk to each other. Anthropic built it, but it's now a Linux Foundation project with OpenAI and Block as co-stewards. I'm building on top of it, not competing with it.
MCP answers "how do AI tools communicate?" What it doesn't answer is what they say about the user. That's where OpenIdentity comes in.
OpenIdentity is my bet - a schema and interchange format for representing who a user is to an AI system. The key insight is that identity isn't a profile. It's a living system. Static profiles ("I'm a PM, I like concise responses") are table stakes. What matters is how identity evolves, how context decays, how facts gain or lose confidence over time.
Arete [ah-reh-TAY] is the product layer - what users actually touch. A Chrome extension that captures browsing context. An MCP server that bridges your identity to Claude Desktop, Cursor, any tool that speaks the protocol. Local storage that keeps everything on your machine.
The principle: privacy isn't a setting you enable. It's the architecture. Local-first by default. Cloud sync optional.
The schema
Here's what OpenIdentity looks like today:
interface IdentityV2 {
version: "2.0.0";
deviceId: string;
userId?: string;
facts: IdentityFact[];
core: { name?: string; role?: string };
settings: {
decayHalfLifeDays: number;
autoInfer: boolean;
excludedDomains: string[];
autoPromote: boolean;
useHaikuClassification: boolean;
};
}
interface IdentityFact {
id: string;
content: string;
category: "core" | "expertise" | "focus" | "context" | "preference";
maturity: "candidate" | "established" | "proven";
visibility: "public" | "trusted" | "local";
confidence: number;
source: "manual" | "inferred" | "conversation" | "imported";
createdAt: string;
updatedAt: string;
lastValidated: string;
validationCount: number;
sourceRef?: string;
}

A few design decisions worth explaining:
Facts, not profiles. Instead of a rigid structure, identity is a collection of facts with metadata. This lets the system learn arbitrary things without schema changes. "Interested in Supabase" and "prefers prose over bullets" are both just facts. This wasn't the first design - v1 had rigid fields like name, role, and expertise as fixed properties. The move to facts-based architecture came from watching what actually needed to be stored.
Confidence decay. Facts you haven't used in months should matter less than facts you use daily. Every fact has a confidence score that decays exponentially:
effective_confidence = base_confidence * (0.5 ^ (days_since_use / half_life))
Default half-life is 60 days. A fact with 0.9 confidence that goes unused for 60 days drops to 0.45. This means your identity naturally reflects what you're working on now, not six months ago.
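Here's a minimal sketch of that calculation. The function name and millisecond math are mine, not part of the spec; it just restates the formula above against the IdentityFact timestamps.

// Sketch of exponential confidence decay. Names are illustrative, not part of OpenIdentity.
const MS_PER_DAY = 86_400_000;

function effectiveConfidence(
  baseConfidence: number,
  lastValidated: string,   // ISO timestamp, e.g. fact.lastValidated
  halfLifeDays = 60        // settings.decayHalfLifeDays default
): number {
  const daysSinceUse = (Date.now() - Date.parse(lastValidated)) / MS_PER_DAY;
  return baseConfidence * Math.pow(0.5, daysSinceUse / halfLifeDays);
}

// 0.9 confidence, unused for 60 days, half-life 60 days -> 0.45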
Maturity levels. New inferences start as "candidates" - things the system noticed but hasn't confirmed. User validation promotes them to "established" (2+ validations). Continued validation makes them "proven" (5+ validations). This prevents the system from confidently asserting things it shouldn't.
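As a sketch, the promotion rule can be as simple as counting validations against those thresholds - 2 and 5 are the numbers above; the function itself is illustrative.

// Illustrative promotion rule based on the thresholds described above.
type Maturity = "candidate" | "established" | "proven";

function maturityFor(validationCount: number): Maturity {
  if (validationCount >= 5) return "proven";
  if (validationCount >= 2) return "established";
  return "candidate";
}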
Privacy tiers. Not all facts should flow everywhere. The visibility field controls this:
- public - safe for any AI tool ("prefers concise answers", "uses TypeScript")
- trusted - only authorized apps ("works at Stripe", "building stealth startup")
- local - never leaves device ("planning to leave job", "salary expectations")
Default is trusted. When you export, you choose the ceiling. Export with --visibility public and only public facts travel. Privacy isn't a toggle you flip - it's baked into every fact.
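In code, choosing the ceiling is just a filter over the visibility field. A rough sketch, assuming a simple ordering of the three tiers (the helper names are mine):

// Illustrative export-time filter: keep only facts at or below the chosen ceiling.
type Visibility = "public" | "trusted" | "local";
const TIER_ORDER: Record<Visibility, number> = { public: 0, trusted: 1, local: 2 };

function factsForExport<T extends { visibility: Visibility }>(
  facts: T[],
  ceiling: Visibility   // e.g. "public" for --visibility public
): T[] {
  return facts.filter(f => TIER_ORDER[f.visibility] <= TIER_ORDER[ceiling]);
}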
The interchange format
A schema is just a spec until you can move it. OpenIdentity isn't just how identity is stored - it's how identity travels. The .oi file format makes it portable:
{
"$schema": "https://openidentity.org/schema/v1.0.json",
"version": "1.0.0",
"exportedAt": "2025-01-15T10:30:00Z",
"sourceApp": "arete",
"identity": {
"role": "Senior Engineer at Stripe"
},
"facts": [
{
"category": "expertise",
"content": "Expert in TypeScript and React",
"confidence": 1.0,
"maturity": "proven",
"visibility": "public"
}
],
"export": {
"visibility": "public",
"factsIncluded": 12,
"factsExcluded": 5
}
}

Export from Arete. Import into any tool that speaks the format. The export metadata tells you what was filtered - if factsExcluded is 5, you know 5 facts were too private for this export tier.
The spec is public and versioned. Currently v0.1 - I expect breaking changes, but the structure is stabilizing.
The context layer
Identity facts are the riverbed - persistent, shaped over time. But there's also the river itself: raw context events flowing through the system.
interface ContextEvent {
type: "page_visit" | "selection" | "conversation" | "insight" | "file";
source: string; // e.g., "chrome", "cli", "claude-desktop"
timestamp: string;
data: object; // type-specific payload
}

Context events are ephemeral and high-volume. The Chrome extension captures page visits and selections. Claude Desktop logs conversation snippets. The system watches this stream for patterns, and when something recurs enough, it crystallizes into an identity fact.
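A page-visit event might look like this. The data fields are illustrative - the spec only says the payload is type-specific.

// Illustrative page_visit event; the exact data payload shape is not specified.
const event: ContextEvent = {
  type: "page_visit",
  source: "chrome",
  timestamp: "2025-01-15T10:30:00Z",
  data: { url: "https://supabase.com/docs", title: "Supabase Docs", secondsOnPage: 210 },
};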
How context becomes identity
This is where the river and riverbed metaphor becomes concrete.
The Chrome extension captures the river - page visits, selections, time on page. Patterns get detected: "visited 4 pages about Supabase this week." The system generates a candidate fact: "Interested in Supabase" with low confidence.
If the user validates it, or the pattern continues, confidence increases. Eventually it becomes an established fact that persists and shapes how AI tools see you.
Old facts don't disappear - they just fade below the threshold until relevant again. Your identity is always current, but history isn't lost.
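As a rough sketch of the crystallization step - the threshold, helper name, and starting confidence are mine, not the shipped pipeline - the flow from pattern to candidate fact looks something like this:

// Rough sketch: a recurring pattern becomes a low-confidence candidate fact.
function maybeCrystallize(topic: string, visitsThisWeek: number): IdentityFact | null {
  if (visitsThisWeek < 4) return null;   // not enough signal yet
  const now = new Date().toISOString();
  return {
    id: crypto.randomUUID(),
    content: `Interested in ${topic}`,
    category: "focus",
    maturity: "candidate",   // starts unconfirmed
    visibility: "trusted",   // the default tier
    confidence: 0.3,         // low until validated or repeated
    source: "inferred",
    createdAt: now,
    updatedAt: now,
    lastValidated: now,
    validationCount: 0,
  };
}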
For tool builders
If you're building an AI tool - a GPT wrapper, a Cursor plugin, an agent framework - OpenIdentity solves your cold start problem.
Why adopt it:
| Benefit | What it means |
|---|---|
| No cold start | Users arrive with preferences and expertise already defined |
| Skip onboarding | Don't ask "tell me about yourself" - import their identity |
| Privacy-respecting | Visibility tiers mean users control what you see |
| No lock-in | Open spec, MIT license - implement it yourself |
Integration is simple. Parse a JSON file. Filter by visibility. Inject relevant facts into your system prompt. Maybe 50 lines of code.
import { importFromOpenIdentity } from "@arete/core";
const file = await readFile("user-identity.oi", "utf-8");
const result = importFromOpenIdentity(JSON.parse(file));
const preferences = result.identity.facts
.filter(f => f.category === "preference")
.map(f => f.content);

The bet: if enough tools adopt OpenIdentity, users will expect it. "Works with my AI identity" becomes a checkbox feature, like "Sign in with Google."
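Rounding out the snippet above, the final step - injecting facts into the system prompt - is just string assembly. The prompt wording and the public-only filter here are my choices, not part of the spec.

// Sketch: turn imported facts into a system prompt preamble.
// Assumes `result` from importFromOpenIdentity above; only public facts are used here.
const publicFacts = result.identity.facts
  .filter(f => f.visibility === "public")
  .map(f => `- ${f.content}`)
  .join("\n");

const systemPrompt = `You are assisting this user. Known about them:\n${publicFacts}`;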
Beyond user identity
Everything above is about representing you to AI systems. But there's another layer emerging: agent-to-agent identity.
Anthropic's Project Vend put Claude agents in charge of running a small shop. The multi-agent setup exposed identity failures: agents hallucinated relationships with non-existent employees, nearly installed an "imposter CEO" due to identity confusion, attempted unauthorized contracts. The coordination layer existed. The identity layer didn't.
When AI agents coordinate on your behalf, they'll need what humans need: identity they can trust, context they can share, and a way to verify both. Who authorized this agent? What can it claim about the user it represents? Can I trust what it says about itself?
OpenIdentity today solves user → AI. The same primitives (facts with confidence, maturity levels, provenance tracking) could extend to agent → agent. That's the longer bet.
The competitive landscape
Three categories of competition, each with different dynamics.
Developer memory tools like Mem0 and Zep sell to developers: "Add memory to your agent." The developer owns the data. The user is the subject of the memory, not the owner of it. Good for customer service bots. Doesn't solve fragmented user identity. I actually see these as complements - a developer using Mem0 could integrate with OpenIdentity to bootstrap from user-owned context.
Platform memory from Claude and ChatGPT is getting better. But it's locked to their platforms. Your Claude memory doesn't help in Cursor. Each platform wants to be where context lives - that's how they build switching costs. This is competition, but also opportunity. The worse fragmentation gets, the more valuable a portable solution becomes.
Personal memory products like Pickle are closest to what I'm building. YC-backed, AR glasses, real funding. Same insight about user-owned context. Different bet: they're building a platform, I'm building a protocol.
Platforms capture value by being the destination. Protocols capture value by being the format. If AI context is like photos - something you want stored well in one place - Pickle wins. If it's like financial data - something you need flowing between many tools - the protocol wins.
I'm betting on the second. But I should be honest: Pickle has more resources and a clearer path to revenue. The protocol bet is higher risk.
What could go wrong
I should be honest about my bets and how they could fail.
Users might not care. Maybe people don't switch AI tools enough for fragmentation to matter. Maybe lock-in feels fine because each platform is good enough. Counter: developers and power users care intensely. The early adopter market is real, even if the mass market isn't ready.
Model creators might solve this. Anthropic and OpenAI are expanding beyond chat interfaces. Claude Code is now in the browser. ChatGPT has memory. These companies have the distribution, the talent, and the incentive to own the identity layer themselves.
Counter: their incentive is lock-in, not portability. Anthropic's memory helps Claude users, not Cursor users. Each platform wants your context trapped in their ecosystem. That's exactly the fragmentation OpenIdentity exists to solve.
The schema might be wrong. If I get the abstraction wrong, tools won't adopt it. Counter: facts with metadata is flexible. I can iterate. First mover beats perfect.
Local-first might be wrong. The world keeps choosing cloud convenience. Maybe users want someone else to manage their AI identity. Counter: AI identity feels more personal. And local-first doesn't preclude cloud sync - it just makes it optional.
The window might close. MCP is hot now, but momentum can shift. If a different protocol wins, or the ecosystem consolidates too fast, the opportunity disappears. Counter: MCP just got major backing. The window is more open than ever. But I need to move fast.
Where this is going
Shipped:
- MCP server (npx arete-mcp-server setup)
- Chrome extension (browser context capture)
- OpenIdentity v0.1 spec
- Privacy tiers (public/trusted/local)
- Export/import (.oi files)
Next: VS Code extension (developers live in their editor), more AI tool integrations, community feedback on the spec.
Eventually: mobile context capture, agent-to-agent identity primitives, broader adoption of OpenIdentity as a standard.
The goal isn't to build the best memory product. It's to define what a user is to an AI system. If OpenIdentity becomes the standard - or even influences what the standard looks like - that's a win, regardless of whether Arete is the product people use.
Plaid didn't win by being the only way to connect bank accounts. They won by being the way everyone assumed you'd do it.
- Arete on GitHub (github.com) - Plaid for AI identity and context. Local-first, open-source.
- OpenIdentity Spec (github.com) - The open standard for portable AI identity. Version 0.1.
- arete-mcp-server on NPM (npmjs.com) - Install with npx. Connect your identity to Claude Desktop, Cursor, and any MCP-compatible tool.