Hey HN! I'm Fabio and I built UltraContext, a simple context API for AI agents with automatic versioning.
After two years building AI agents in production, I experienced firsthand how frustrating it is to manage context at scale: storing messages, iterating on system prompts, debugging agent behavior, wiring up multi-agent patterns, all while keeping track of everything without breaking anything. It was driving me insane.
So I built UltraContext. The mental model is git for context:
- Updates and deletes automatically create versions (history is never lost)
- Replay state at any point
The API is 5 methods:
    uc.create()  // new context (can fork from existing)
    uc.append()  // add message
    uc.get()     // retrieve by version, timestamp, or index
    uc.update()  // edit message → creates version
    uc.delete()  // remove message → creates version
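Roughly, a session looks like this. This is a simplified sketch: assume `uc` is an initialized client, and the argument shapes shown are illustrative rather than the documented signatures.

    // `uc` is an initialized UltraContext client; argument shapes
    // below are illustrative, not the documented signatures.
    const ctx = await uc.create()                          // fresh context
    const msg = await uc.append(ctx.id, { role: 'user', content: 'hi' })
    await uc.update(ctx.id, msg.id, { content: 'hello' })  // old value kept as v1
    const original = await uc.get(ctx.id, { version: 1 })  // time-travel read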
Messages are schema-free. Store conversation history, tool calls, system prompts—whatever shape you need. Pass it straight to your LLM using any framework you'd like.
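For example, with the OpenAI Node SDK (illustrative: the assumption here is that uc.get() returns an array of plain { role, content } message objects):

    import OpenAI from 'openai'

    const openai = new OpenAI()
    const messages = await uc.get(ctx.id)  // latest version of the context
    const reply = await openai.chat.completions.create({
      model: 'gpt-4o-mini',
      messages,
    })
    // persist the assistant's reply back into the context
    await uc.append(ctx.id, reply.choices[0].message)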
What it's for:
- Persisting conversation state across sessions
- Debugging agent behavior (rewind to the exact decision point)
- Forking contexts to test different flows (see the sketch after this list)
- Audit trails without building audit infrastructure
- Multi-agent and sub-agent patterns
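To make the debugging and forking items concrete, in pseudocode (parameter names like `forkFrom` and `index` are illustrative):

    // Branch off an existing context to test an alternate flow
    const branch = await uc.create({ forkFrom: ctx.id })
    // Read the context as it stood at message 12
    const asOfStep = await uc.get(ctx.id, { index: 12 })
    // ...replay `asOfStep` through the agent to find where it went wrong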
What it's NOT:
- Not a memory/RAG system (no semantic search)
- Not a vector database
- Not an orchestration or LLM framework
UltraContext handles versioning, branching, and history. You get time-travel in one line.
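For example (option name illustrative):

    const snapshot = await uc.get(ctx.id, { at: '2025-06-01T12:00:00Z' })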
Docs: https://ultracontext.ai/docs
Early access: https://ultracontext.ai
Would love feedback! Especially from anyone who's rolled their own context engineering and can tell me what I'm missing.
I've been working on AI memory backends and context management myself, and the core insight here (that context needs to be versionable and inspectable, not just a growing blob) is spot on.
Tried UltraContext in my project TruthKeeper and it clicked immediately. Being able to trace back why an agent “remembered” something wrong is a game changer for production debugging.
One thing I'd love to hear your thoughts on: compression strategies for long-running agents. I've been experimenting with semantic compression to keep context windows manageable without losing critical information. Great work, will be following this closely.