Overview
Go beyond memory to agents with actual social intelligence
When building agents, developers often run into the same walls:
“My agent forgets everything between chats”
You need memory: session management, message storage, context handling. It’s table stakes, but surprisingly complex to get right.
“My agent treats everyone exactly the same”
You need personalization: user modeling, preference learning, behavioral adaptation. Now you’re building a social cognition engine.
“I’m writing infrastructure instead of features”
You need Honcho

Honcho delivers production-ready memory infrastructure from day one. Store conversations, manage sessions, get perfectly formatted context for any LLM. But here’s the magic: while your agents are chatting, Honcho is learning. It builds Theory of Mind models automatically, transforming raw conversations into rich psychological understanding.
Your agents evolve from goldfish to counselor, on the same infrastructure. That's Honcho.

Designed for developers and agents alike:
Natural Language Queries: Chat with Honcho in natural language via the Dialectic API and let agents backchannel
Automatic Context Management: Smart summarization that respects token limits
Native multi-agent support: Break out of User/Assistant Paradigms and build complex multi-agent systems
Agent-first interfaces: MCP connections and APIs designed for agents to consume and use as tools
Provider Agnostic: Works with any LLM or Agent Framework
How It Works
Storage
Developers use Honcho to store information about their users and application via two integrated layers:

Memory Layer: Captures all user interactions - messages, preferences, and behavioral patterns - in a peer-centric data model that scales from individual conversations to complex multi-agent scenarios. It also queues up messages for the reasoning layer to process.

Reasoning Layer: Continuously analyzes stored interactions to build psychological profiles using theory of mind inference, extracting patterns about communication style, decision-making preferences, and mental models.
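To make the peer-centric model concrete, here is a minimal sketch of the idea: every participant (human or agent) is a "peer", sessions group peers exchanging messages, and stored messages are also queued for the reasoning layer. All class and field names below are illustrative assumptions, not Honcho's actual schema.

```python
from dataclasses import dataclass, field
from collections import deque
from typing import Deque, Dict, List

@dataclass
class Message:
    peer_id: str   # any peer can speak: humans and agents alike
    content: str

@dataclass
class Session:
    id: str
    peer_ids: List[str]                         # N peers, not just user/assistant
    messages: List[Message] = field(default_factory=list)

class MemoryLayer:
    """Toy memory layer: durable message storage plus a hand-off queue
    for the reasoning layer (names are hypothetical)."""

    def __init__(self) -> None:
        self.sessions: Dict[str, Session] = {}
        self.reasoning_queue: Deque[Message] = deque()

    def add_message(self, session: Session, msg: Message) -> None:
        session.messages.append(msg)        # store the interaction
        self.reasoning_queue.append(msg)    # queue it for ToM inference

memory = MemoryLayer()
s = Session(id="s1", peer_ids=["alice", "tutor-agent"])
memory.sessions[s.id] = s
memory.add_message(s, Message(peer_id="alice", content="I prefer terse answers."))
```

Because messages belong to peers rather than to a fixed user/assistant pair, the same model covers one-on-one chats and multi-agent scenarios without restructuring.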
Retrieval
Once data is stored and generated within Honcho, the API exposes several different ways to retrieve and use those insights.
Dialectic API: The flagship endpoint. Send natural language queries to Honcho to chat with the representation of each user in your system and get dynamic, in-context, actionable insights.

Example Queries
“What’s the best way to explain technical concepts to this user?”
“Is this user more task-oriented or relationship-oriented?”
“What time of day is this user most engaged?”
“How does this user prefer to receive feedback?”
“What are this user’s core values based on our conversations?”
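A Dialectic query is ultimately just a natural-language question scoped to a user (and optionally a session). The helper below only illustrates that shape; the field names and structure are assumptions for illustration, not Honcho's actual wire format or SDK, so consult the API reference for real calls.

```python
import json
from typing import Optional

def build_dialectic_query(peer_id: str, question: str,
                          session_id: Optional[str] = None) -> str:
    """Assemble an illustrative Dialectic-style request body:
    a free-form question scoped to one peer (field names hypothetical)."""
    payload = {"peer_id": peer_id, "query": question}
    if session_id:
        payload["session_id"] = session_id   # optionally scope to one session
    return json.dumps(payload)

body = build_dialectic_query(
    "alice", "How does this user prefer to receive feedback?"
)
```

The point is that agents can backchannel with plain questions like the examples above, rather than assembling retrieval pipelines themselves.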
Get Context: This endpoint abstracts away context window constraints, continuously retrieving the most relevant and recent data from a conversation. Provide a token budget and Honcho returns a combination of summaries and messages that capture the session's context. Use this to power long-running conversations; summaries are crafted to cover as much of the session as possible.
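The budgeting idea can be sketched in a few lines: lead with a session summary, then pack in as many recent messages as the budget allows. The word-count tokenizer here is a crude stand-in, and the function is an illustration of the concept, not Honcho's implementation.

```python
from typing import List

def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer."""
    return len(text.split())

def get_context(summary: str, messages: List[str],
                token_budget: int) -> List[str]:
    """Return [summary, ...recent messages] within a token budget,
    preferring the newest messages (illustrative sketch)."""
    used = count_tokens(summary)
    recent: List[str] = []
    for msg in reversed(messages):          # walk newest-first
        cost = count_tokens(msg)
        if used + cost > token_budget:
            break                           # budget exhausted
        recent.append(msg)
        used += cost
    return [summary] + list(reversed(recent))  # restore chronological order

ctx = get_context(
    summary="Alice is debugging a Rust borrow-checker error.",
    messages=["msg one here", "msg two here", "msg three here"],
    token_budget=15,
)
```

With a 15-token budget, the 7-token summary plus the two newest 3-token messages fit; the oldest message is dropped, which is exactly the trade-off summaries are there to cover.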
Search: This endpoint lets you search across Honcho for relevant messages at the workspace, peer, or session level, using a hybrid strategy that combines text search with cosine similarity.
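As a toy illustration of hybrid search, the sketch below blends a keyword-overlap score with cosine similarity over hand-rolled bag-of-words vectors. Honcho's actual implementation uses proper text indexing and learned embeddings; the `alpha` blend weight and scoring functions here are assumptions.

```python
import math
from typing import List

def vectorize(text: str, vocab: List[str]) -> List[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_score(query: str, doc: str) -> float:
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def hybrid_search(query: str, docs: List[str], alpha: float = 0.5) -> List[str]:
    """Rank docs by a blend of keyword overlap and cosine similarity
    (bag-of-words stand-in for real embeddings)."""
    vocab = sorted({w for t in docs + [query] for w in t.lower().split()})
    qv = vectorize(query, vocab)
    scored = [
        (alpha * keyword_score(query, d)
         + (1 - alpha) * cosine(qv, vectorize(d, vocab)), d)
        for d in docs
    ]
    return [d for _, d in sorted(scored, reverse=True)]

results = hybrid_search(
    "feedback style",
    ["alice likes direct feedback", "weather is nice today"],
)
```

Blending the two signals lets exact-term matches and semantically related phrasing both surface, which is the motivation for a hybrid strategy over either alone.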
Working Representations: Get a cached snapshot of a user in the context of a session. Instead of waiting for an LLM to synthesize an in-context response via the Dialectic endpoint, use this to retrieve recent insights you can plug directly into your context window.
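The caching trade-off can be sketched as follows: serve a precomputed snapshot keyed by peer and session while it is fresh, and fall back to a live Dialectic query only when it is missing or stale. The TTL, keys, and snapshot format below are illustrative assumptions, not Honcho's internals.

```python
import time
from typing import Dict, Optional, Tuple

class WorkingRepCache:
    """Toy cache of per-(peer, session) snapshots with a freshness TTL."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl = ttl_seconds
        self._store: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def put(self, peer_id: str, session_id: str, snapshot: str) -> None:
        self._store[(peer_id, session_id)] = (time.monotonic(), snapshot)

    def get(self, peer_id: str, session_id: str) -> Optional[str]:
        entry = self._store.get((peer_id, session_id))
        if entry is None:
            return None              # no snapshot yet
        stored_at, snapshot = entry
        if time.monotonic() - stored_at > self.ttl:
            return None              # stale: caller falls back to Dialectic
        return snapshot

cache = WorkingRepCache()
cache.put("alice", "s1", "Prefers terse, example-driven explanations.")
snap = cache.get("alice", "s1")
```

Reading a cached snapshot is a dictionary lookup rather than an LLM call, which is why working representations are the low-latency path for context assembly.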
Ideal For
Personalized AI assistants that need to understand individual psychology, not just remember conversations.
Customer-facing agents that must adapt their approach based on user communication preferences and emotional context.
Multi-agent systems where AI needs to understand human collaborators’ working styles and decision-making patterns.
NPCs where you want autonomous agents with rich, deep personalities rather than the voice of the average sycophantic LLM.