Memory System
Hermes Agent has bounded, curated memory that persists across sessions. This lets it remember your preferences, your projects, your environment, and things it has learned.
How It Works
Two files make up the agent’s memory:
| File | Purpose | Char Limit |
|---|---|---|
| MEMORY.md | Agent’s personal notes — environment facts, conventions, things learned | 2,200 chars (~800 tokens) |
| USER.md | User profile — your preferences, communication style, expectations | 1,375 chars (~500 tokens) |
Both are stored in ~/.hermes/memories/ and are injected into the system prompt as a frozen snapshot at session start. The agent manages its own memory via the memory tool — it can add, replace, or remove entries.
How Memory Appears in the System Prompt
At the start of every session, memory entries are loaded from disk and rendered into the system prompt as a frozen block:

    ══════════════════════════════════════════════
    MEMORY (your personal notes) [67% — 1,474/2,200 chars]
    ══════════════════════════════════════════════
    User's project is a Rust web service at ~/code/myapi using Axum + SQLx
    §
    This machine runs Ubuntu 22.04, has Docker and Podman installed
    §
    User prefers concise responses, dislikes verbose explanations

The format includes:
- A header showing which store (MEMORY or USER PROFILE)
- Usage percentage and character counts so the agent knows capacity
- Individual entries separated by § (section sign) delimiters
- Entries can be multiline
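The block format described above can be approximated with a short rendering function. This is a minimal sketch, not the Hermes implementation: `render_block`, the rule width, the newline-padded delimiter, and the choice to count delimiters toward usage are all assumptions made for illustration.

```python
RULE = "═" * 46          # assumed width of the horizontal rule
DELIMITER = "\n§\n"      # assumed: section sign on its own line between entries


def render_block(title: str, entries: list[str], limit: int) -> str:
    """Render one memory store as a header followed by delimited entries."""
    body = DELIMITER.join(entries)
    used = len(body)  # assumption: delimiters count toward the character budget
    pct = round(100 * used / limit) if limit else 0
    # \u2014 is the dash used in the header format shown in the docs.
    header = f"{title} [{pct}% \u2014 {used:,}/{limit:,} chars]"
    return "\n".join([RULE, header, RULE, body])


if __name__ == "__main__":
    print(render_block(
        "MEMORY (your personal notes)",
        ["This machine runs Ubuntu 22.04", "User prefers concise responses"],
        2200,
    ))
```

Because the header carries both the percentage and the raw counts, the agent can judge remaining capacity without a separate read call.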
Frozen snapshot pattern: The system prompt injection is captured once at session start and never changes mid-session. This preserves the LLM’s prefix cache for performance. When the agent adds/removes memory entries during a session, the changes are persisted to disk immediately but won’t appear in the system prompt until the next session starts.
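The frozen snapshot behaviour can be illustrated with a small sketch. The `Session` class, its method names, and the single-file storage layout are hypothetical; the only points taken from the text are that the prompt is captured once and that writes reach disk immediately.

```python
import tempfile
from pathlib import Path


class Session:
    """Captures memory once at start; later writes only touch the file on disk."""

    def __init__(self, memory_file: Path) -> None:
        self.memory_file = memory_file
        # Snapshot taken exactly once, so the prompt prefix stays identical
        # for the whole session and the prefix cache remains valid.
        self.system_prompt = self._read()

    def _read(self) -> str:
        if self.memory_file.exists():
            return self.memory_file.read_text(encoding="utf-8")
        return ""

    def add_entry(self, entry: str) -> None:
        current = self._read()
        updated = f"{current}\n§\n{entry}" if current else entry
        self.memory_file.write_text(updated, encoding="utf-8")  # persisted now
        # self.system_prompt is deliberately left untouched.


def demo() -> tuple[str, str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "MEMORY.md"
        path.write_text("Server runs Debian 12", encoding="utf-8")
        first = Session(path)
        first.add_entry("User is in the docker group")
        during = first.system_prompt           # still the old snapshot
        on_disk = path.read_text(encoding="utf-8")
        after = Session(path).system_prompt    # next session sees the change
        return during, on_disk, after
```

In this sketch the new entry is visible on disk right away, but only a freshly constructed `Session` picks it up in its prompt.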
Memory Tool Actions
The agent uses the memory tool with these actions:
- add: Add a new memory entry
- replace: Replace an existing entry with updated content (uses substring matching via old_text)
- remove: Remove an entry that’s no longer relevant (uses substring matching via old_text)
There is no read action — memory content is automatically injected into the system prompt at session start.
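The three actions can be sketched as operations on an in-memory list of entries. This is an illustrative model only: `MemoryStore` and its methods are hypothetical names, and rejecting an `old_text` that matches zero or several entries is an assumption about how ambiguity might be handled.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Toy model of one memory target (for example, memory or user)."""

    entries: list[str] = field(default_factory=list)

    def _find(self, old_text: str) -> int:
        # Locate the single entry that contains old_text as a substring.
        matches = [i for i, entry in enumerate(self.entries) if old_text in entry]
        if len(matches) != 1:
            raise ValueError(
                f"old_text matched {len(matches)} entries; expected exactly 1"
            )
        return matches[0]

    def add(self, content: str) -> None:
        self.entries.append(content)

    def replace(self, old_text: str, content: str) -> None:
        self.entries[self._find(old_text)] = content

    def remove(self, old_text: str) -> None:
        del self.entries[self._find(old_text)]
```

There is intentionally no read method in the sketch, mirroring the tool: the current entries reach the agent through the system prompt instead.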
Substring Matching
The replace and remove actions match an entry by a short, unique substring, so you don’t need the full entry text:

    # If memory contains "User prefers dark mode in all editors"
    memory(action="replace", target="memory", old_text="dark mode", content="User prefers light mode in VS Code, dark mode in terminal")

Two Targets Explained
memory: Agent’s Personal Notes
For information the agent needs to remember about the environment, workflows, and lessons learned:
- Environment facts (OS, tools, project structure)
- Project conventions and configuration
- Tool quirks and workarounds discovered
- Completed task diary entries
- Skills and techniques that worked
user: User Profile
For information about the user’s identity, preferences, and communication style:
- Name, role, timezone
- Communication preferences (concise vs detailed, format preferences)
- Pet peeves and things to avoid
- Workflow habits
- Technical skill level
What to Save vs Skip
Save These (Proactively)
The agent saves automatically; you don’t need to ask. It saves when it learns:
- User preferences: “I prefer TypeScript over JavaScript” → save to user
- Environment facts: “This server runs Debian 12 with PostgreSQL 16” → save to memory
- Corrections: “Don’t use sudo for Docker commands, user is in docker group” → save to memory
- Conventions: “Project uses tabs, 120-char line width, Google-style docstrings” → save to memory
- Completed work: “Migrated database from MySQL to PostgreSQL on 2026-01-15” → save to memory
- Explicit requests: “Remember that my API key rotation happens monthly” → save to memory
Skip These
- Trivial/obvious info: “User asked about Python” is too vague to be useful
- Easily re-discovered facts: “Python 3.12 supports f-string nesting” can be found again with a web search
- Raw data dumps: Large code blocks, log files, and data tables are too big for memory
- Session-specific ephemera: Temporary file paths, one-off debugging context
- Information already in context files: SOUL.md and AGENTS.md content
Capacity Management
Memory has strict character limits to keep system prompts bounded:
| Store | Limit | Typical entries |
|---|---|---|
| memory | 2,200 chars | 8-15 entries |
| user | 1,375 chars | 5-10 entries |
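The limits and token estimates quoted in this document imply a fixed characters-per-token ratio, which the following arithmetic sketch makes explicit. The ratio is derived only from the figures stated here (2,200 chars ≈ 800 tokens, 1,375 chars ≈ 500 tokens); real tokenizers vary with content, so treat the numbers as rough budgeting aids.

```python
# (char_limit, approximate_tokens) as documented for each store.
LIMITS = {"memory": (2200, 800), "user": (1375, 500)}


def chars_per_token(store: str) -> float:
    chars, tokens = LIMITS[store]
    return chars / tokens


def total_tokens() -> int:
    return sum(tokens for _, tokens in LIMITS.values())


def avg_entry_chars(store: str, entry_count: int) -> float:
    """Average room per entry if the store is filled with entry_count entries."""
    return LIMITS[store][0] / entry_count


if __name__ == "__main__":
    for name in LIMITS:
        print(name, chars_per_token(name))
    print("total tokens:", total_tokens())
    print("memory, 8 entries:", avg_entry_chars("memory", 8))
```

Both stores work out to the same ratio, and the combined budget matches the roughly 1,300 tokens that persistent memory adds to every prompt.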
What Happens When Memory is Full
When the agent tries to add an entry that would exceed the limit, the tool returns an error. The agent should then:
- Read the current entries (shown in the error response)
- Identify entries that can be removed or consolidated
- Use replace to merge related entries into shorter versions
- Then add the new entry
Best practice: When memory is above 80% capacity, consolidate entries before adding new ones.
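The overflow workflow and the 80% guideline can be sketched as follows. The function names, the shape of the error response, and the way usage is counted (entries joined by a newline-padded § delimiter) are assumptions for illustration, not the tool's actual contract.

```python
DELIMITER = "\n§\n"  # assumed separator, counted toward usage


def usage(entries: list[str]) -> int:
    return len(DELIMITER.join(entries))


def try_add(entries: list[str], content: str, limit: int) -> dict:
    """Return the new entry list, or an error that echoes the current entries."""
    candidate = entries + [content]
    if usage(candidate) > limit:
        return {
            "ok": False,
            "error": f"adding this entry would use {usage(candidate)}/{limit} chars",
            "entries": list(entries),  # shown so the agent can consolidate
        }
    return {"ok": True, "entries": candidate}


def should_consolidate(entries: list[str], limit: int, threshold: float = 0.80) -> bool:
    return usage(entries) / limit > threshold


def demo(limit: int = 55) -> tuple[dict, dict]:
    entries = ["Project uses tabs", "Project uses 120-char lines"]
    first = try_add(entries, "Google-style docstrings", limit)
    # Consolidate the two related entries into one shorter entry, then retry.
    merged = ["Tabs, 120-char lines"]
    second = try_add(merged, "Google-style docstrings", limit)
    return first, second
```

With the small limit used in `demo`, the first attempt fails and returns the existing entries, and the retry succeeds once the two related entries are merged.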
Session Search
Beyond MEMORY.md and USER.md, the agent can search its past conversations using the session_search tool:
- All CLI and messaging sessions are stored in SQLite (~/.hermes/state.db) with FTS5 full-text search
- Search queries return actual messages from the DB, with no LLM summarization and no truncation
- The agent can find things it discussed weeks ago, even if they’re not in its active memory
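To show what an FTS5-backed lookup looks like in general, here is a minimal sketch using Python's built-in `sqlite3` module. The `messages` table, its columns, and the sample rows are hypothetical and do not reflect the real schema of `~/.hermes/state.db`; the sketch also assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical FTS5 message table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE messages USING fts5(session_id, role, content)"
    )
    conn.executemany(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        [
            ("s1", "user", "Migrated the database from MySQL to PostgreSQL"),
            ("s2", "assistant", "Docker and Podman are both installed"),
            ("s3", "user", "Please use tabs and Google-style docstrings"),
        ],
    )
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[tuple]:
    """Return stored messages matching the full-text query, best match first."""
    return conn.execute(
        "SELECT session_id, role, content FROM messages "
        "WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


if __name__ == "__main__":
    for row in search(build_demo_db(), "postgresql"):
        print(row)
```

The query returns the stored rows verbatim, which is the property the list above highlights: results are the original messages rather than summaries.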

    hermes sessions list    # Browse past sessions

session_search vs memory
| Feature | Persistent Memory | Session Search |
|---|---|---|
| Capacity | ~1,300 tokens total | Unlimited (all sessions) |
| Speed | Instant (in system prompt) | ~20ms FTS5 query |
| Cost | Token cost in every prompt | Free — no LLM calls |
| Use case | Key facts always available | Finding specific past conversations |
| Management | Manually curated by agent | Automatic — all sessions stored |
Configuration

    # In ~/.hermes/config.yaml
    memory:
      memory_enabled: true
      user_profile_enabled: true
      memory_char_limit: 2200   # ~800 tokens
      user_char_limit: 1375     # ~500 tokens

External Memory Providers
For deeper, persistent memory that goes beyond MEMORY.md and USER.md, Hermes ships with 8 external memory provider plugins: Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover, and Supermemory.
External providers run alongside built-in memory (never replacing it) and add capabilities like knowledge graphs, semantic search, automatic fact extraction, and cross-session user modeling.

    hermes memory setup     # pick a provider and configure it
    hermes memory status    # check what's active

See the Memory Providers guide for full details.