The Operator’s Manual for Hermes Agent
Building an AI assistant that can act, remember, and improve
Operator’s Manual · Edition 3.2 · Verified against official Nous Research documentation
About This Manual
This manual explains how to deploy and operate Hermes Agent as a persistent “operator” (an AI system that runs continuously, uses tools, remembers context across sessions, and improves over time) rather than as a single-session chatbot. It covers architecture, installation, the core mental model, day-to-day workflows, the operator loop, common failure modes, advanced configuration (including offline skill optimization with GEPA), and a distilled set of operational lessons.
Hermes Agent moves quickly. It was first released publicly on 25 February
2026 and has shipped frequent updates since. Exact command flags, default
values, tool counts, and bundled-skill counts change between releases. This
edition was checked against the official documentation at
hermes-agent.nousresearch.com/docs, the GitHub repositories
(github.com/NousResearch/hermes-agent and
github.com/NousResearch/hermes-agent-self-evolution), and the project site.
When a specific number or flag matters, verify it against the current docs or
with --help.
How to use this manual
The manual is organized so that any topic can be found and re-read quickly:
- Chapters 1-3 are conceptual: architecture and the mental model. Read these once; they make every later feature predictable.
- Chapters 4-5 are operational: workflows and how to run Hermes continuously.
- Chapter 6 is a failure-mode index: scan it when something breaks.
- Chapters 7-8 are advanced configuration and distilled lessons.
- The Appendix is a pure command reference.
Some material deliberately appears in more than one place: a concept is explained once (in Chapters 1-3), then referenced where it is applied (Chapters 4-8) and listed for quick lookup (Chapter 6 and the Appendix). That is reference-manual design, not accidental duplication.
Corrections incorporated in this edition
Earlier informal guides to Hermes Agent contained inaccuracies. They are corrected here and listed so the differences are explicit:
- There is no “Hermes Vault” feature. Credential management is hermes auth (credential pools with same-provider key rotation) plus the official 1password skill. See Chapter 7.
- There is no /skill command. An installed skill is loaded by typing /<skill-name> directly. /skills is a separate command for searching, installing, and managing skills.
- There are six terminal backends: local, docker, ssh, daytona, singularity, and modal. Daytona and Modal are the serverless options.
- Built-in memory is file-based (MEMORY.md and USER.md). It is one of three memory tiers; see Chapter 3.
- Skills are auto-generated, not only hand-written. Hermes creates skills from experience, refines them with the skill_manage tool, maintains them with a background Curator, and can optimize them offline with GEPA.
- hermes tools enable NAME is not a documented subcommand. Toolsets are managed through hermes tools, hermes setup tools, or the -t/--toolsets flag.
Conventions
This manual addresses the reader directly (“you”, “your”). ~/.hermes/ is the
Hermes home directory; /home/user/... stands in for an absolute path that
should be replaced with a real one. Commands in code blocks are run from a
shell unless marked as in-session slash commands.
Introduction: The Operator Model
The common way to use a large language model is as a chat box: a prompt is pasted in, output is copied out. This works for lookups and drafts, but it has a structural ceiling. The model holds no context between conversations, the human carries all continuity, and every session starts cold. The limit is not prompt quality; it is the interaction model.
There is a second model. Instead of asking a model questions, you delegate work to an operator: a system that runs continuously, remembers what it has learned, uses tools to act, schedules its own follow-up work, and verifies its own results. The difference is the difference between looking up driving directions and employing an assistant who already knows the route, notices when conditions change, and adjusts the plan unprompted. One is a lookup. One is a delegation.
Hermes Agent is built for the second model. Its one-line pitch is an agent that gets better the longer you use it. What makes that real is that three capabilities usually found in separate tools sit in one framework: runtime skill learning, persistent multi-layer memory, and an optional offline optimization pipeline. The shift that matters is conceptual: once a task is set up, it runs, and the human’s role moves from executing to operating, which means supervising, verifying, and intervening only where a human decision is genuinely required.
Chapter 1: What Hermes Agent Is and How It Is Built
The pitch
Hermes Agent is an open-source, self-improving AI agent framework built by Nous Research, released in February 2026 under the MIT license. It runs on Linux, macOS, WSL2, native Windows (early beta), and Android via Termux. It connects to almost any LLM provider, exposes dozens of built-in tools, and, as its distinguishing property, learns: it creates reusable skills from experience, refines them as it uses them, remembers facts across sessions, and can search its own conversation history.
Hermes is not tied to a single machine. It can run on a low-cost VPS, a GPU server, or serverless infrastructure (Daytona, Modal) that hibernates when idle and costs almost nothing between tasks. It can be operated from a terminal or from any of 20+ messaging platforms.
Architecture
Understanding the structure makes every later feature predictable.
Everything flows through a single core agent class (AIAgent, in run_agent.py).
The CLI, the messaging gateway, IDE integration, the batch runner, and an API
server are all entry points into that same core, which is what makes the
platform-agnostic story work in practice.
Entry points             Core agent                     Backends
+--------------+         +---------------------+        +--------------------+
| CLI          |--+      | AIAgent             |--+---> | Session storage    |
| Gateway      |--+      | +-----------------+ |  |     | (SQLite + FTS5)    |
| IDE (ACP)    |--+--->  | | Prompt builder  | |  |     +--------------------+
| Batch runner |--+      | | Provider res.   | |  |     +--------------------+
| API server   |--+      | | Tool dispatch   | |  +---> | Tool backends:     |
+--------------+         | | Compression     | |        | terminal, web,     |
                         | +-----------------+ |        | browser, file,     |
                         +---------------------+        | MCP, vision, TTS   |
                                                        +--------------------+
The core loop is ReAct-style and synchronous: build the system prompt, check whether context compression is needed, make an interruptible API call, execute any tool calls the model requested, and loop. Four details matter operationally:
- Six execution backends. The agent can run commands locally, in Docker, on a remote host over SSH, or in a Modal, Daytona, or Singularity sandbox: the same code, changed with one config setting. Execution can be moved from a laptop to a cloud server without touching anything else.
- Provider translation. A translation layer routes any provider through one of a small number of API formats, which is why the active model (Claude, GPT, Gemini, a local Ollama model) can be swapped with one command and nothing else breaking.
- A per-task turn cap (90 by default). Each task has a hard ceiling on the number of reasoning/tool turns. Without it, an agent stuck in a loop (retrying a failing API, re-reading the same file) would silently consume credits. Sub-agents spawned by delegation share the same budget, so a runaway delegation chain cannot bypass the cap. The ceiling is configurable in the setup wizard.
- Context compression. When a session approaches the model’s context-window limit, the loop compresses history automatically. Compression can also be triggered manually; see Chapter 5.
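The turn cap and the shared sub-agent budget can be pictured with a small sketch. This is not Hermes's implementation: `TurnBudget`, `run_task`, and the `step` callback are hypothetical names introduced here, and `step` stands in for one model call plus any tool execution.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TurnBudget:
    """One counter shared by a task and any sub-agents it delegates to."""

    limit: int = 90  # default cap quoted in this manual; configurable in Hermes
    used: int = 0

    def take(self) -> bool:
        """Consume one turn; return False once the cap has been reached."""
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def run_task(step: Callable[[int], Optional[str]], budget: TurnBudget) -> str:
    """Call `step` until it returns an answer or the shared budget runs out.

    `step` returns None to request another turn, or a string as the final answer.
    """
    while budget.take():
        answer = step(budget.used)
        if answer is not None:
            return answer
    return "stopped: turn cap reached"


# A stuck agent never produces an answer, so the cap ends the task.
budget = TurnBudget(limit=5)
stuck_result = run_task(lambda turn: None, budget)

# A sub-agent handed the same (now exhausted) budget cannot take further turns.
delegated_result = run_task(lambda turn: "done", budget)
```

Passing the same `TurnBudget` object to a delegated run is what keeps a delegation chain from resetting the count.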
Where Hermes fits: the comparison with OpenClaw
Hermes is not primarily a coding copilot tied to an editor. Its closest peer in the open ecosystem is the personal-agent project OpenClaw. Both are persistent and messaging-friendly, but they make opposite architectural choices. A frequently quoted framing: Hermes packages a gateway around a learning agent; OpenClaw packages an agent around a messaging gateway.
| Dimension | OpenClaw | Hermes Agent |
|---|---|---|
| Architecture | Gateway-first; the agent is attached to the messaging layer | Agent-first; the gateway is one entry point into a learning runtime |
| Channel breadth | Very broad (50+ messaging channels) | Focused (20+ channels, the most-used ones) |
| Skill ecosystem | Very large community skill pool | ~120 skills bundled, plus the Skills Hub and GitHub taps |
| Learning loop | Skills are static | Skills self-evolve; the Curator prunes; GEPA optimizes offline |
| Memory | Plain markdown files | Three tiers: bounded markdown, FTS5 search, pluggable external providers |
| Security posture | Gateway-first design and a large unvetted skill pool have been associated with publicized incidents in 2026 | Snapshot-before-write for file operations and a curated skill set reduce some surface |
Treat the security row as point-in-time and directional, not as a current audit: verify present advisories for both projects before relying on either in a sensitive context (see Chapter 8). Setups can be migrated directly from OpenClaw; see Chapter 7.
Chapter 2: Installation and First Run
A working installation, configured and running a real task, takes roughly 30 minutes including troubleshooting. Requirements: Linux, macOS, or WSL2 (native Windows and Android/Termux are also supported); Python 3.11+, which the installer provides; and around 8 GB of RAM for ordinary API-based use.
Install
Linux, macOS, or WSL2:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.bashrc # or ~/.zshrc
Native Windows (PowerShell, early beta):
iex (irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1)
Android (Termux): use the same curl one-liner as Linux; the installer
detects Termux automatically.
For an existing installation, hermes update pulls the latest version.
Automatic pre-update backups are off by default; add --backup for a
snapshot, or set update.backup: true in config.yaml. hermes update --check
reports whether a newer version exists without installing it.
Setup wizard
hermes setup
The first run launches an interactive wizard with five sections:
- Model: choose an LLM provider and enter an API key, or run an OAuth flow.
- Terminal: choose an execution backend (local, docker, ssh, daytona, singularity, modal).
- Gateway: optionally configure messaging platforms.
- Tools: enable or disable toolsets.
- Agent: behavior settings such as the per-task turn cap and compression thresholds.
Configuration is written to ~/.hermes/config.yaml; API keys are stored
separately in ~/.hermes/.env and never in config.yaml. Any single section
can be re-run later, e.g. hermes setup model.
hermes model runs the full provider-setup wizard from a shell (it can add a
provider or run OAuth). The in-session /model command switches only among
already-configured providers. Model identifier strings change as providers
release new models; treat any specific string as an example.
Health check
hermes doctor # checks dependencies, config, credentials, directories
hermes doctor --fix # attempts automatic repairs
hermes status # visual overview of agent, auth, and platform state
hermes dump # plain-text summary for a support request
Run hermes doctor before any significant work.
Two rules prevent most early confusion. First, toolset changes take effect
only in a new session; start one with /new. Second, configuration is read once
at startup and cached; after editing config.yaml directly, exit and relaunch
(or /restart the gateway). Prefer hermes config set <key> <value> or
hermes config edit over editing the YAML blindly.
First task
hermes chat -q "List the contents of ~/projects and tell me what kind of projects you see there"
This returns a directory listing and a short summary. It is a first real agent
action: Hermes read the filesystem with its file tool, reasoned about the
result, and replied. To start an interactive session, run hermes chat (or just
hermes).
Expect early friction: first-time setup commonly includes an API key error or
provider timeout. Wait about 60 seconds and retry; if it persists, verify the
key in ~/.hermes/.env.
What lives in ~/.hermes/
Most day-to-day work touches one of these paths. Knowing the layout makes the rest of the manual concrete.
~/.hermes/
|-- config.yaml          # main configuration (non-secret)
|-- .env                 # API keys, bot tokens, secrets
|-- auth.json            # OAuth provider credentials
|-- SOUL.md              # agent identity; slot #1 in the system prompt
|
|-- memories/
|   |-- MEMORY.md        # persistent agent facts (Tier 1 memory)
|   `-- USER.md          # the agent's model of you (Tier 1 memory)
|
|-- skills/              # all skills: bundled, hub-installed, agent-created
|   |-- <category>/<skill-name>/SKILL.md
|   |-- .archive/        # skills the Curator has archived (recoverable)
|   `-- .hub/            # Skills Hub state
|
|-- sessions/            # per-platform session metadata
|-- state.db             # SQLite session store, FTS5-indexed (Tier 2 memory)
|-- cron/
|   |-- jobs.json        # scheduled jobs
|   `-- output/          # cron run outputs
|
|-- profiles/<name>/     # isolated profiles, each a full Hermes home
|-- plugins/             # custom plugins
|-- hooks/               # lifecycle hooks
|-- skins/               # CLI themes
`-- logs/                # agent.log, gateway.log, errors.log
Most of this is never edited by hand. The files worth knowing: config.yaml
(source of truth for everything non-secret), .env (secrets; Hermes routes
secret-looking values here automatically), SOUL.md (identity; Chapter 3),
skills/ (where the entire learning loop lives), and state.db (the
FTS5-indexed database that makes “what did we discuss three weeks ago?” work).
Chapter 3: The Mental Model
Hermes becomes predictable once a small set of concepts is understood. They are Tools, Identity, Skills, Context files, Memory, Sessions, Cron, and Gateway, tied together by the Learning Loop. A useful framing to hold throughout: memory is what the agent knows; skills are how it does things; identity is who it is.
Toolsets: what Hermes can do
A tool is a single capability (run a shell command, search the web). A toolset is a named group of related tools. Hermes ships dozens of tools (the exact count grows with releases), organized into toolsets. The capabilities most operators rely on:
| Capability | What it does |
|---|---|
| web | Web search and content extraction |
| terminal | Run shell commands, manage processes |
| file | Read, write, search, and patch files |
| browser | Browser automation (local Chrome over CDP, or a cloud browser) |
| code execution | Sandboxed Python via execute_code, including Programmatic Tool Calling |
| vision | Analyze images |
| image generation | Generate images |
| tts / voice | Text-to-speech and real-time voice |
| delegation | Spawn isolated sub-agents for parallel work |
| cron | Schedule tasks on a timeline |
| memory | Persistent cross-session facts |
| session search | Full-text search over past conversations |
| skills | Browse, install, and load skills |
| messaging | Send messages across platforms |
| kanban | Drive the multi-agent collaboration board |
| computer use | Drive a desktop GUI (macOS, via the cua-driver backend) |
Toolsets are configured through the interactive hermes tools UI, the
hermes setup tools section, or per run with -t/--toolsets (e.g.
hermes chat --toolsets web,terminal,skills). The authoritative list is in the
Toolsets Reference in the official docs. Enable only what a task needs: a
smaller tool schema produces cleaner agent behavior.
Identity: SOUL.md
Above memory and skills sits a layer that determines who the agent is when it shows up: identity. Without it, every agent feels like the same agent wearing different hats.
Identity is a single hand-authored file, ~/.hermes/SOUL.md. It occupies the
first slot in the system prompt, before anything else loads, and defines the
agent’s personality, tone, communication style, and hard limits.
# SOUL.md
You are a pragmatic senior engineer with strong taste.
You optimize for truth, clarity, and usefulness over politeness theater.
SOUL.md is static (written once, tweaked occasionally) and stays consistent
across every project and session. If the file is missing, Hermes falls back to a
built-in default identity (and now seeds a starter SOUL.md automatically).
Identity matters to the self-improving story because everything that follows
(the memory the agent writes, the skills it creates, the way it consolidates
knowledge) happens through the lens of this file. SOUL.md is the fixed frame;
memory and skills are the moving parts inside it. For a temporary change of
register without editing the file, /personality swaps in a built-in or custom
personality preset for the current session only.
Skills: procedural memory the agent writes itself
If memory holds facts, skills hold procedures: not what the agent knows but
how it does things. A skill is a markdown file with YAML frontmatter, stored
under ~/.hermes/skills/.
---
name: k8s-pod-debug
description: >
  Activate for crashing pods, CrashLoopBackOff,
  "why is my pod restarting", container failures.
version: 1.2.0
author: agent
platforms: [linux, macos]
---
## Procedure
1. Get pod status, check events, pull logs
2. Look for OOMKilled, ImagePullBackOff, config errors
## Pitfalls
- Forgetting the --previous flag on restarted containers
## Verification
- Pod stays Running with 0 restarts for 5+ minutes
Progressive disclosure. To keep token cost low, skills load in three levels:
- Level 0: the agent sees only names and descriptions (roughly 100 tokens per skill; about 3k tokens for the full catalog). This is all that loads at session start.
- Level 1: the full skill body, loaded on demand when a skill’s triggers match the task.
- Level 2: specific reference files inside a skill, opened only when the agent needs that depth.
The result: the agent pays in tokens only for the skills it actually uses.
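As a rough illustration of Level 0, the sketch below builds a one-line catalog entry from a skill's frontmatter and ignores the body entirely. The parser is deliberately minimal and is an assumption of this example, not Hermes's loader; `read_frontmatter` and `catalog_entry` are hypothetical names, and a real loader would use a proper YAML parser.

```python
def read_frontmatter(skill_text: str) -> dict[str, str]:
    """Return top-level `key: value` pairs from a SKILL.md frontmatter block.

    Folded values (`>`) are joined into one line; nested YAML is not supported.
    """
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    key = None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter; the skill body is never read
        if line.startswith((" ", "\t")) and key is not None:
            # Indented line: continuation of the previous folded value.
            fields[key] = (fields[key] + " " + line.strip()).strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            value = value.strip()
            fields[key] = "" if value in (">", "|") else value
    return fields


def catalog_entry(skill_text: str) -> str:
    """Level 0 view of a skill: its name and description only."""
    meta = read_frontmatter(skill_text)
    return f"{meta.get('name', '?')}: {meta.get('description', '')}"


SKILL = """---
name: k8s-pod-debug
description: >
  Activate for crashing pods, CrashLoopBackOff,
  "why is my pod restarting", container failures.
version: 1.2.0
---
## Procedure
1. Get pod status, check events, pull logs
"""

entry = catalog_entry(SKILL)
```

Only `entry` would be paid for at session start; the procedure text stays on disk until the skill is actually needed.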
Self-evolution. This is the core differentiator. The agent creates its own
skills autonomously using the skill_manage tool. Creation triggers when the
agent completes a complex task (roughly five or more tool calls), recovers from
errors and finds the working path, is corrected by the operator, or discovers a
non-trivial workflow. The loop: encounter a problem, solve it through trial and
error, save the working approach as a SKILL.md file, and, the next time a
similar problem appears, load that skill and follow the proven procedure instead
of rediscovering it. One-time discoveries become permanent procedural memory.
The skill_manage tool supports six actions: create, patch (a targeted
fix; preferred because it is token-efficient), edit (a full rewrite),
delete, write_file, and remove_file.
Skills can also be written by hand and shared. Browse and install community
skills with hermes skills browse and hermes skills install; publish with
hermes skills publish; group several under one slash command with
hermes bundles. Any GitHub repository can be added as a custom tap:
hermes skills tap add yourname/your-skills-repo
hermes skills install yourname/your-skills-repo/<skill-name>
About 120 skills ship bundled; the Skills Hub (skills.sh / agentskills.io)
and community taps add many more. Counts grow with every release.
The Curator: maintenance for the skill library
Without maintenance, agent-created skills accumulate into dozens of narrow, overlapping playbooks that waste tokens. The Curator is a background maintenance system that prevents this.
It runs on an inactivity check, not a cron daemon: when at least 7 days have passed since its last run and the agent has been idle for 2 or more hours, a background fork of the agent spins up with its own prompt cache, never touching the active conversation. It operates in two phases:
- Automatic transitions (deterministic, no LLM): a skill unused for 30 days becomes stale; unused for 90 days, it is archived.
- LLM review (up to 8 iterations): the forked agent surveys all agent-created skills and decides, per skill, whether to keep, patch, consolidate, or archive.
Active --30d unused--> Stale --90d unused--> Archived --> Restored
  ^     (deterministic, no LLM)    (moved to .archive/)    (one command,
  +---------------- re-activated on use ----------------+   reversible)
Two constraints make the Curator safe: it never touches bundled or
hub-installed skills (only agent-authored ones), and it never auto-deletes. The
worst outcome is archival to ~/.hermes/skills/.archive/, recoverable with one
command. Before every pass it takes a tar.gz snapshot of the entire skills
directory, and rollbacks are themselves reversible. Critical skills can be
protected with hermes curator pin <skill>; patches and edits still apply to a
pinned skill, so the agent can improve it without it being unpinned first.
hermes curator run --dry-run previews a pass at any time.
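The trigger condition and the deterministic phase described above can be written down in a few lines. This is a sketch built only from the thresholds stated in this chapter; the function names and the string states are hypothetical, and the LLM review phase is out of scope.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=30)
ARCHIVE_AFTER = timedelta(days=90)
MIN_GAP_BETWEEN_RUNS = timedelta(days=7)
MIN_IDLE = timedelta(hours=2)


def curator_due(now: datetime, last_run: datetime, last_activity: datetime) -> bool:
    """Inactivity check: enough time since the last pass AND the agent is idle."""
    return (now - last_run >= MIN_GAP_BETWEEN_RUNS
            and now - last_activity >= MIN_IDLE)


def transition(state: str, last_used: datetime, now: datetime,
               agent_authored: bool = True) -> str:
    """Deterministic phase: derive a skill's state from how long it sat unused."""
    if not agent_authored:
        return state  # bundled and hub-installed skills are never touched
    unused = now - last_used
    if unused >= ARCHIVE_AFTER:
        return "archived"
    if unused >= STALE_AFTER:
        return "stale"
    return "active"  # recent use re-activates a stale skill


now = datetime(2026, 6, 1, 12, 0)
due = curator_due(now, last_run=now - timedelta(days=8),
                  last_activity=now - timedelta(hours=3))
states = [
    transition("active", now - timedelta(days=10), now),
    transition("active", now - timedelta(days=45), now),
    transition("stale", now - timedelta(days=120), now),
    transition("active", now - timedelta(days=120), now, agent_authored=False),
]
```

Because this phase is pure date arithmetic, it can run without spending any model tokens.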
Context files: AGENTS.md and friends
Distinct from skills (loaded on demand) are context files (loaded
automatically, every session). SOUL.md, covered above, is one. Hermes also
discovers and loads, from the current working directory:
- AGENTS.md: a project’s standing rules. Placed in a project root, it holds architecture, conventions, and instructions (“FastAPI backend with SQLAlchemy”, “always use async for database operations”, “never commit .env”). Hermes loads the top-level AGENTS.md at session start; subdirectory AGENTS.md files are discovered lazily during tool calls.
- .hermes.md and CLAUDE.md: also recognized as project context files, so an existing CLAUDE.md written for Claude Code is picked up without conversion.
- .cursorrules: a .cursorrules or .cursor/rules/*.mdc file is read automatically, so existing coding conventions need not be duplicated.
The division: SOUL.md defines who Hermes is; AGENTS.md (and the others)
define a project’s rules; memory holds facts about you; skills hold
procedures. Keep context files concise: every character is injected into
every message and counts against the token budget. --ignore-rules skips all
context files, memory, and preloaded skills for a clean-room run.
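To see what a project would contribute to every prompt, the file names listed above can be checked with a short script. This is an illustrative sketch, not Hermes's discovery code: it simply collects every recognized file in one directory, and any precedence or deduplication Hermes applies is not modeled. `discover_context_files` and `injected_chars` are hypothetical helpers.

```python
import tempfile
from pathlib import Path

# File names taken from the list above.
CONTEXT_FILENAMES = ("AGENTS.md", ".hermes.md", "CLAUDE.md", ".cursorrules")


def discover_context_files(project_root: Path) -> list[Path]:
    """Collect recognized context files in the project root."""
    found = [project_root / name for name in CONTEXT_FILENAMES
             if (project_root / name).is_file()]
    rules_dir = project_root / ".cursor" / "rules"
    if rules_dir.is_dir():
        found.extend(sorted(rules_dir.glob("*.mdc")))
    return found


def injected_chars(paths: list[Path]) -> int:
    """Total characters these files would add to the prompt."""
    return sum(len(p.read_text(encoding="utf-8")) for p in paths)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "AGENTS.md").write_text(
        "Always use async for database operations.\n", encoding="utf-8")
    (root / ".cursor" / "rules").mkdir(parents=True)
    (root / ".cursor" / "rules" / "style.mdc").write_text(
        "Prefer type hints.\n", encoding="utf-8")
    found = discover_context_files(root)
    names = [p.name for p in found]
    total = injected_chars(found)
```

Running something like this against a real project root gives a quick sense of how much standing context each message carries.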
Context references: pulling content in with @
Separate from the always-loaded context files, context references inject
content into a single message on demand. Typing @ followed by a reference
(a file path, a folder, a git diff, or a URL) expands it inline: Hermes appends
the referenced content to the message automatically. This is the precise tool
for “look at this” without restructuring a project’s AGENTS.md or pasting a
file by hand. Use context files for what should always be loaded; use @
references for what matters only to the message in front of you.
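The idea of expanding a reference can be sketched for the simplest case, a file path. Everything here is an assumption for illustration: the regular expression, the separator format, and the `expand_references` name are not Hermes's, and folders, git diffs, and URLs are not handled.

```python
import re
import tempfile
from pathlib import Path

# "@" at the start of a word, followed by a run of non-space characters.
# The lookbehind keeps email addresses such as ops@example.com from matching.
REFERENCE = re.compile(r"(?<!\S)@(\S+)")


def expand_references(message: str, base: Path) -> str:
    """Append the contents of each @file reference that exists under `base`."""
    parts = [message]
    for target in REFERENCE.findall(message):
        path = base / target
        if path.is_file():
            content = path.read_text(encoding="utf-8")
            parts.append(f"\n--- {target} ---\n{content}")
    return "".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "notes.txt").write_text("retry budget is 3\n", encoding="utf-8")
    expanded = expand_references("Summarize @notes.txt for me", base)
    untouched = expand_references("Mail me at ops@example.com", base)
```

A reference that does not resolve to a file is left in the message as typed, which keeps the sketch from guessing at content.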
Memory: three tiers, three speeds
Hermes does not have a single memory. It has three layers, each for a different purpose. The agent picks the right one for the question.
Tier 1: in-prompt memory. Two small markdown files in ~/.hermes/memories/:
MEMORY.md (the agent’s notes on your environment, conventions, tool quirks,
and lessons learned; about 2,200 characters) and USER.md (your profile: name,
communication preferences, skill level, things to avoid; about 1,375
characters). Both are injected into the system prompt as a frozen snapshot at
session start: a memory written mid-session persists to disk immediately but
does not appear in the prompt until the next session. When a file approaches
capacity (about 80%, shown as a percentage in the system-prompt header), the
agent consolidates, merging related entries into denser versions so only
useful information survives. You can assist: “clean up your memory”, “replace
the old Python 3.9 note; we are on 3.12 now”. Because Tier 1 is plain markdown,
it can be audited directly by opening the files. Speed: instant. Capacity: tiny.
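Since the Tier 1 files are plain text with character budgets, their fill level is easy to check by hand. The sketch below uses the approximate limits and the roughly 80% consolidation point quoted above; the constants and function names are illustrative, and the exact figures may differ between releases.

```python
MEMORY_LIMIT = 2200     # approximate MEMORY.md budget, in characters
USER_LIMIT = 1375       # approximate USER.md budget, in characters
CONSOLIDATE_AT_PCT = 80  # approximate point at which consolidation starts


def usage_percent(text: str, limit: int) -> int:
    """Whole-number percentage of the character budget in use."""
    return len(text) * 100 // limit


def needs_consolidation(text: str, limit: int) -> bool:
    """True once the file has reached the consolidation threshold."""
    return usage_percent(text, limit) >= CONSOLIDATE_AT_PCT


# Stand-ins for file contents; in practice, read
# ~/.hermes/memories/MEMORY.md and USER.md instead.
memory_text = "x" * 1900
user_text = "y" * 500

memory_pct = usage_percent(memory_text, MEMORY_LIMIT)
user_pct = usage_percent(user_text, USER_LIMIT)
memory_full = needs_consolidation(memory_text, MEMORY_LIMIT)
user_full = needs_consolidation(user_text, USER_LIMIT)
```

The small budgets are the point of the tier: anything that does not fit belongs in session search or an external provider.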
Tier 2: session search. Every conversation, from the CLI and from messaging
platforms, is stored in state.db (SQLite, FTS5-indexed). The agent can search
weeks of past conversations with the session_search tool, summarizing matches
with an LLM. Speed: on demand. Capacity: effectively unlimited. This is the same
store that powers session resume (see Sessions, below).
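For readers unfamiliar with FTS5, the following sketch shows the kind of full-text query it makes possible, using Python's built-in sqlite3 module. The table layout (`messages` with `session` and `body` columns) is invented for this example and is not the schema of state.db; the example also assumes the local SQLite build includes FTS5, which is the common case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout for illustration only; not the real state.db schema.
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(session, body)")
conn.executemany(
    "INSERT INTO messages (session, body) VALUES (?, ?)",
    [
        ("gguf-research", "Compared GGUF quantization benchmarks against full precision"),
        ("kiln-debug", "The pytest failure came from a stale fixture in conftest"),
        ("daily-briefing", "Scheduled the content radar for weekdays at 09:00"),
    ],
)


def search(db: sqlite3.Connection, query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return (session, highlighted snippet) pairs, best match first."""
    return db.execute(
        "SELECT session, snippet(messages, 1, '[', ']', '...', 8) "
        "FROM messages WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


hits = search(conn, "quantization")
no_hits = search(conn, "kubernetes")
```

Queries are plain words here; FTS5 has its own query syntax, so punctuation in user input would need quoting before being passed to MATCH.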
Tier 3: external memory providers. For a deeper, persistent model of you,
Hermes ships pluggable providers configured with hermes memory setup. Only one
external provider is active at a time, and it runs alongside Tier 1 rather
than replacing it. Available providers, each with different storage, cost, and
dependency trade-offs, include Honcho (dialectic user modeling), OpenViking
(self-hosted, filesystem hierarchy), Mem0 (server-side LLM extraction),
Hindsight (knowledge graph), Holographic (local, no dependencies), RetainDB
(delta compression), ByteRover, and SuperMemory (context fencing). When an
external provider is active, Hermes prefetches relevant memories before each
turn, syncs conversation turns after each response, and extracts memories at
session end. hermes memory status shows current state.
The trade-off across tiers is the design: critical facts live in Tier 1, always in context but bounded; everything else is searchable on demand in Tier 2; Tier 3 adds a deeper model at the cost of an external dependency. Use Tier 1 for durable facts (“always run X from directory Y”), not for session-specific context (“working on X today”), which belongs in the session itself.
Sessions: the conversation thread
Each chat is a session: resumable, nameable, and searchable.
hermes --resume <id-or-title> # resume a specific session
hermes --continue # resume the most recent session
hermes --continue <name> # resume the most recent matching a title
Sessions are managed with hermes sessions (list, browse, rename,
prune, export, delete). Naming sessions with /title keeps them findable;
unnamed sessions become an indistinguishable pile within a week. Sessions
provide continuity and an audit trail and, as Tier 2 above, are searchable.
Cron: scheduled execution
Cron runs tasks on a schedule. The gateway daemon ticks every 60 seconds, runs
any due jobs in isolated sessions, and delivers output to a messaging platform.
Jobs survive restarts; they live in ~/.hermes/cron/jobs.json and output goes
to ~/.hermes/cron/output/.
You do not have to write cron expressions: a job can be described in plain
English and Hermes converts it. The in-session /cron command also accepts
explicit forms:
/cron add 30m "Remind me to check the build" # one-shot, runs once in 30 minutes
/cron add "every 2h" "Check server status" # recurring interval
/cron add "0 9 * * 1-5" "..." # standard cron expression: weekdays 09:00
/cron add "every 1h" "Summarize new items" --skill blogwatcher # load a skill before running
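The three explicit forms above differ in how the schedule string is read. The sketch below classifies them the way the comments describe; it is an illustration of the distinction, not Hermes's parser. The regular expression, the restriction to the m and h units (the only ones shown in this manual), and the `classify_schedule` name are all assumptions.

```python
import re

# "<n>m" / "<n>h", optionally prefixed with "every".
INTERVAL = re.compile(r"^(every\s+)?(\d+)\s*([mh])$")
UNIT_SECONDS = {"m": 60, "h": 3600}


def classify_schedule(spec: str) -> tuple[str, object]:
    """Return (kind, value): seconds for intervals, the expression for cron."""
    spec = spec.strip()
    match = INTERVAL.match(spec)
    if match:
        seconds = int(match.group(2)) * UNIT_SECONDS[match.group(3)]
        kind = "recurring" if match.group(1) else "one-shot"
        return kind, seconds
    if len(spec.split()) == 5:
        # Five whitespace-separated fields: treat as a standard cron expression.
        return "cron", spec
    raise ValueError(f"unrecognized schedule: {spec!r}")


one_shot = classify_schedule("30m")
recurring = classify_schedule("every 2h")
weekday_morning = classify_schedule("0 9 * * 1-5")
```

The practical difference is the presence of "every": a bare duration fires once, while the prefixed form repeats.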
From a shell, jobs are managed with hermes cron list / create / edit / pause /
resume / run / remove / status. Jobs can be chained: one job’s output becomes
the next job’s input via a context_from flag, which is useful for multi-stage
automations (a research step feeding a writing step).
Delivery needs a destination. Run /sethome in a Telegram or Discord chat to
mark it as the home channel for proactive output; without one, the agent has
nowhere to send scheduled results. If a job reports success but no message
arrives (a failure mode observed with jobs producing structured output), verify
the home channel first, then consult the cron-troubleshooting guide; the
reliable fallback is to have the job’s script post to the platform API directly
(see Chapter 4). A job that should not fire until reviewed can be created and
immediately paused.
Gateway: Hermes in messaging platforms
The Gateway runs Hermes as a service inside messaging platforms: the CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email, SMS, Microsoft Teams, and others, 20+ in all.
hermes gateway setup # interactive platform configuration
hermes gateway run # run in the foreground (recommended for WSL, Docker, Termux)
hermes gateway install # install as a background service (standard Linux, macOS)
hermes gateway start / stop / restart / status / list
Per-platform setup differs: Telegram needs a bot token from @BotFather (and
your numeric user ID, available from @userinfobot, for the allowlist); Discord
needs a bot token with the Message Content Intent enabled; Slack uses an app
manifest (hermes slack manifest). Gateway activity is logged to
~/.hermes/logs/gateway.log.
WSL note. WSL’s systemd support is unreliable, so on WSL do not start the
gateway as a background service. Run it in the foreground inside tmux so it
survives the terminal closing: tmux new -s hermes 'hermes gateway run'.
The Learning Loop: how it fits together
The concepts above feed one another. The agent acts, captures what worked as skills, persists durable facts to memory, and the Curator keeps the skill set clean, so the agent is measurably more capable after months of use than on day one:
act --> notice a reusable pattern --> skill_manage creates/patches a skill
 ^                                                      |
 |                                                      v
remember facts <-- self-prompt to persist <-- Curator prunes & consolidates
                                                        |
                                    (GEPA optimizes offline; see Ch. 7)
In one sentence: SOUL.md sets the identity, the runtime loop captures
experience, the Curator keeps the library clean, and GEPA makes sure what is in
the library actually works.
Summary of concepts
- Tools: what Hermes can do (capabilities)
- Identity: who Hermes is (SOUL.md, slot #1 in the system prompt)
- Skills: how Hermes does things (procedural memory, partly self-authored)
- Context files: a project’s standing rules (AGENTS.md, .cursorrules)
- Memory: what Hermes knows (three tiers: in-prompt, session search, external)
- Sessions: what was being worked on (history and continuity)
- Cron: what runs automatically (scheduling)
- Gateway: where Hermes can be reached (access)
- Learning Loop: how all of the above compound over time
Any new Hermes feature can be placed against this list, which makes the system predictable rather than a set of features to memorize.
Chapter 4: Core Workflows
Four representative workflows, with example prompts and expected behavior.
Workflow A: Research pipeline
Goal: turn an open question into a structured reference file.
Example prompt:
Research the current state of GGUF quantization for local LLM inference.
Find: (1) what tools support GGUF (2) performance benchmarks vs full precision
(3) any recent updates. Write a structured summary to
~/research/gguf-state-$(date +%Y%m%d).md with sources.
Hermes activates web tools, searches, reads full pages rather than snippets, synthesizes across them, writes a structured markdown file, and lists sources. Because the output is a file, the task can be started and left to complete.
Workflow B: Repository debugging
Goal: hand a broken repository to Hermes and receive a fix or a clear explanation of what must change.
Example prompt:
There's a test failure in ~/projects/kiln. Run pytest and tell me what's
failing and why. If it's a simple fix, apply it and rerun to confirm.
Report what the problem was and what you changed.
Hermes changes into the directory, runs the tests, reads the failing test and
relevant source, identifies the root cause, applies a fix, and re-runs to
confirm. For obvious defects a one-shot query suffices; for complex failures, an
interactive session lets you walk the stack trace with the agent. For risky
edits, add --checkpoints so files can be restored with /rollback (Chapter 5).
Workflow C: Scheduled daily briefing
Goal: have Hermes run a recurring task unattended and deliver the result.
Create a cron job with a schedule and a prompt (see Chapter 3 for the /cron add forms). For reliable delivery, the job can run a script that posts directly
to the platform API. The example below uses Telegram:
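#!/bin/bash
# Daily content radar: posts a briefing to Telegram
set -euo pipefail  # stop on the first failed command so the caller sees a non-zero exit
# Read credentials from the Hermes .env file (anchored match, first hit only)
TELEGRAM_BOT_TOKEN=$(grep -m1 '^TELEGRAM_BOT_TOKEN=' ~/.hermes/.env | cut -d= -f2-)
TELEGRAM_CHAT_ID=$(grep -m1 '^TELEGRAM_CHAT_ID=' ~/.hermes/.env | cut -d= -f2-)
BRIEFING=$(hermes -z "Run a content radar: find 3-5 interesting posts about AI
agents, local model setups, or Hermes workflows from the past 24 hours.
Format as a numbered briefing with links.")
# --fail turns an HTTP error response into a non-zero exit code;
# --data-urlencode keeps newlines, "&", and "+" in the briefing intact
curl -sS --fail -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
-d "chat_id=${TELEGRAM_CHAT_ID}" \
--data-urlencode "text=${BRIEFING}" \
-d "parse_mode=Markdown"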
Two details: hermes -z is the scripted one-shot entry point (a single prompt
in, the final answer out, nothing else on stdout), which suits cron, CI, and
parent scripts. And the script posts to the API directly: a direct call fails
loudly (curl exits non-zero on an HTTP error when run with -f), whereas built-in
delivery has been observed to report success while the message did not arrive.
Set up once, this runs unattended on its schedule.
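One failure the direct call can still hit is message length: Telegram's sendMessage accepts at most 4096 characters of text, and a long briefing is rejected. The sketch below clips the text before sending. clip_for_telegram is a hypothetical helper, and the length is counted in shell characters, which only approximates how Telegram counts:

```shell
#!/usr/bin/env bash
TELEGRAM_MAX_CHARS=4096   # sendMessage text limit

# clip_for_telegram is a hypothetical helper, not part of Hermes.
# It prints the text unchanged if it fits, otherwise a clipped copy
# that ends with a visible marker and is exactly max characters long.
clip_for_telegram() {
  local text="$1"
  local max="${2:-$TELEGRAM_MAX_CHARS}"
  local suffix=' [truncated]'
  if [ "${#text}" -le "$max" ]; then
    printf '%s' "$text"
  else
    local keep=$(( max - ${#suffix} ))
    printf '%s%s' "${text:0:$keep}" "$suffix"
  fi
}

# Usage inside the cron script:
#   BRIEFING="$(clip_for_telegram "$BRIEFING")"
```

Clipping loses the tail of the briefing; splitting into several messages is the alternative when every item matters.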
Workflow D: Multi-agent Kanban
Goal: split a large task across specialist agents, run them in parallel, and synthesize.
Hermes’s Kanban is a multi-profile collaboration board. A parent task holds the synthesis; child tasks hold parallel workstreams, each assigned to a specialist profile; children are linked to the parent so the parent runs only after they complete.
# Parent synthesis task
hermes kanban create "Write comprehensive X review" --assignee writer
# Child tasks: note --workspace dir:<absolute path> on every one
hermes kanban create "Research X market landscape" \
--assignee researcher --parent <parent-id> \
--workspace dir:/home/user/research/x-market
hermes kanban create "Technical analysis of X architecture" \
--assignee engineer --parent <parent-id> \
--workspace dir:/home/user/code/x-analysis
# Link children to the parent, then run one dispatcher pass
hermes kanban link <parent-id> <child-a-id>
hermes kanban link <parent-id> <child-b-id>
hermes kanban dispatch
Workspaces are critical. Every task that produces files to keep must use
--workspace dir:/absolute/path. The default scratch workspace is
garbage-collected when a task is archived; a dir: workspace persists.
Hermes can also decompose a task automatically: a coarse task placed in the
triage column is fanned out by hermes kanban decompose (on by default per
dispatcher tick) into child tasks routed to specialist profiles. Keep individual
tasks small enough to complete in a single agent run: a task needing hundreds
of turns will hit the turn cap, crash, and be re-queued.
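Because a relative or missing workspace path silently falls back to behavior you did not intend, a small guard in any script that creates tasks is cheap insurance. A minimal sketch, where workspace_arg is a hypothetical helper that only formats and validates the value passed to --workspace:

```shell
#!/usr/bin/env bash
# workspace_arg is a hypothetical helper, not part of Hermes.
# It prints dir:<path> for an absolute path and fails for anything else.
workspace_arg() {
  local path="$1"
  case "$path" in
    /*) printf 'dir:%s\n' "$path" ;;
    *)  echo "workspace must be an absolute path: $path" >&2
        return 1 ;;
  esac
}

# Usage (requires Hermes to be installed):
#   ws="$(workspace_arg /home/user/research/x-market)" || exit 1
#   hermes kanban create "Research X market landscape" \
#     --assignee researcher --parent <parent-id> --workspace "$ws"
```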
Chapter 5: The Operator Loop
Running Hermes continuously requires an operational model. The principle: an operator is someone who sets up loops, verifies outputs, and intervenes only when a human decision is required, not someone who watches the agent work.
Every operator session moves through four states:
Active --> Waiting --> Recovering --> Done --> (back to Active)
- Active: Hermes is working (running commands, searching, reading or writing files, generating output).
- Waiting: Hermes hit a blocking call (an LLM request, a tool waiting on an external resource, a rate limit). It resumes on its own.
- Recovering: something failed. Hermes retries, summarizes context, or restarts a step, or it flags the operator and waits.
- Done: the task is complete, output delivered, the session saved.
A one-shot query passes through Active -> Waiting -> Done in seconds. A complex project cycles through these states over hours, bounded by the per-task turn cap (90 by default; Chapter 1), so a runaway loop terminates rather than silently consuming credits.
Long-running sessions
For tasks lasting more than a few minutes, use an interactive session, which
keeps context alive across turns. For tasks lasting hours, run them in the
background: /background <prompt> (aliases /bg, /btw) runs the prompt in a
separate session and reports results when finished.
Context management
Every LLM has a finite context window. As a conversation grows, the model degrades without announcing it: it repeats itself, loses earlier detail, and stops noticing the obvious.
- Under roughly two hours of active conversation: built-in compression triggers automatically near the limit. It is conservative, so do not rely on it immediately before a critical decision.
- Over roughly two hours: run /compress deliberately, at a natural stopping point. Compression summarizes the history and replaces it; the original detail is lost, so compress at a milestone, not at a crisis. /compress <focus topic> narrows what is preserved. /usage shows where the session stands.
- Multi-day projects: do not keep one session alive for days. End each session by writing a checkpoint file (/background Write current progress and next steps to ~/checkpoint.md) and read it back next session. A human-readable checkpoint survives any crash; a compressed summary of hundreds of exchanges is still lossy.
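Reading the checkpoint back can be scripted so the next session always starts from it. A minimal sketch, assuming the checkpoint lives at ~/checkpoint.md as in the example above; resume_prompt is a hypothetical helper that only builds the prompt text:

```shell
#!/usr/bin/env bash
# resume_prompt is a hypothetical helper, not a Hermes command.
# It prints a prompt that embeds the checkpoint file, or fails if the
# file is missing or empty so a stale session is not resumed blindly.
resume_prompt() {
  local file="${1:-$HOME/checkpoint.md}"
  if [ ! -s "$file" ]; then
    echo "no checkpoint found at $file" >&2
    return 1
  fi
  printf 'Read this checkpoint from the previous session, then continue with its next steps.\n\n'
  cat "$file"
}

# Usage (requires Hermes to be installed):
#   prompt="$(resume_prompt)" && hermes chat -q "$prompt"
```

Failing on an empty file is deliberate: starting a fresh session with no context is better made explicit than discovered halfway through a task.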
Filesystem checkpoints and rollback
For file-editing work, starting a session with --checkpoints causes Hermes to
snapshot files before destructive changes. In-session, /rollback lists and
restores those snapshots; hermes checkpoints manages the store. This is an
undo mechanism for autonomous edits, independent of git.
Verifying output without supervising continuously
hermes chat -q "Check ~/logs/test-results.md and tell me if all tests passed"
hermes chat -q "What's the current status of the content pipeline? Any failures?"
hermes --continue # resume a session left mid-task
When to intervene
Intervene when Hermes asks a question it genuinely cannot resolve; output is visibly wrong and the agent is not self-correcting; the decision is creative or strategic; or a task has been stuck in Recovering for more than about five minutes.
Do not intervene when the agent is processing (let the turn finish), is Waiting (an API call or rate limit is in flight), or is recovering and making progress (allow one full attempt). Reading output as it streams, rather than waiting for completion, is supervision in name only.
Chapter 6: Common Failure Modes
A failure-mode index. Each entry gives cause, fix, and prevention.
1. Context overflow. Symptom: the agent drifts, repeats itself, loses
earlier context. Cause: the context window is near its limit. Fix:
/compress, or /new and reload context from a file or skill. Prevention:
avoid multi-hour sessions; use file-based checkpoints.
2. Toolset mismatch. Symptom: a toolset was enabled but the agent says it
cannot use the capability. Cause: toolsets load at session start. Fix and
prevention: run /new after any toolset change.
3. Configuration drift. Symptom: a config.yaml edit is ignored, or
Hermes crashes on startup. Cause: configuration is read once and cached.
Fix: exit and relaunch (or /restart the gateway). Prevention: use
hermes config set or hermes config edit.
4. Cron delivery surprises. Symptom: a cron job reports success but no
message arrives. Cause: built-in delivery can fail quietly for jobs emitting
structured output, and has no destination if no home channel is set. Fix: set
a home channel with /sethome; for script-backed jobs, post to the platform API
directly. Prevention: use a home channel and direct API delivery.
5. Scratch workspace data loss. Symptom: a Kanban task completed but its
output files are gone. Cause: the default scratch workspace is
garbage-collected when a task is archived. Fix: none; recreate the work.
Prevention: always create output-producing tasks with
--workspace dir:/absolute/path.
6. Delegation or provider mismatch. Symptom: sub-agents fail immediately
with “model not supported”. Cause: the delegation model in config.yaml does
not match what the current provider can serve. Fix: align them, then restart.
Prevention: after changing the main provider, check the delegation config.
7. WSL gateway stops on terminal close. Cause: WSL’s systemd support is
unreliable. Fix and prevention: run the gateway in the foreground inside
tmux (tmux new -s hermes 'hermes gateway run').
8. Profile name mismatch. Symptom: a Kanban task assigned to a profile is
never picked up. Cause: hermes kanban assign can fail to apply if the
profile name does not exactly match. Fix: verify with hermes kanban show <task-id>. Prevention: copy the exact name from hermes profile list.
9. Shared bot token across profiles. Symptom: messaging breaks when multiple profiles are connected. Cause: a messaging platform allows only one connection per bot token. Fix and prevention: give every profile its own bot and token (Chapter 7).
10. Credentials stored in plaintext. Acceptable for a hobby setup, a
liability for production. Fix: use hermes auth credential pools and the
1password skill rather than scattering keys across .env files.
11. The agent overwrites a hand-tuned skill with a worse version. Cause:
the same self-evolution mechanism that improves skills can degrade a manually
customized one. Fix and prevention: pin important hand-authored skills with
hermes curator pin, review what the Curator changes, and use GEPA (Chapter 7)
when you want trace-driven, test-gated improvement rather than the agent’s own
judgment. See also Chapter 8.
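For entry 7, the tmux command can be wrapped so it is safe to run repeatedly, for example from a login script. A minimal sketch, assuming tmux is installed and the session name hermes from entry 7; ensure_gateway_session is a hypothetical helper:

```shell
#!/usr/bin/env bash
# ensure_gateway_session is a hypothetical helper, not part of Hermes.
# It starts the gateway in a detached tmux session unless a session
# with that name already exists.
ensure_gateway_session() {
  local name="${1:-hermes}"
  if tmux has-session -t "$name" 2>/dev/null; then
    echo "tmux session '$name' already running"
  else
    tmux new-session -d -s "$name" 'hermes gateway run'
    echo "started tmux session '$name'"
  fi
}

# Usage:
#   ensure_gateway_session          # start or confirm the session
#   tmux attach -t hermes           # inspect the gateway output
```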
Chapter 7: Advanced Configuration
The capabilities below become relevant once the basics are running.
Multi-agent orchestration
Kanban (Chapter 4) is the orchestration layer: decompose a goal into specialist
roles, run them in parallel, synthesize. The two constraints that matter (a
dedicated --workspace dir: per task, and tasks small enough to finish in one
run) are covered in Workflow D and are not repeated here.
Running multiple agents: profiles
Profiles allow multiple fully independent Hermes instances, each with its own
config, memory, skills, sessions, and SOUL.md, sharing nothing by default.
Each profile lives at ~/.hermes/profiles/<name>/.
hermes profile create designer --clone # --clone copies the default profile's config and .env
hermes profile create programmer --clone
hermes profile create researcher --clone
hermes profile use <name> # set the sticky default
hermes -p <name> chat -q "..." # one-off override
hermes profile list / show / rename / export / import
To run several agents on messaging platforms at once, give each profile its own bot: a platform allows only one connection per token, so a shared token breaks. Create one bot per profile and run the gateway wizard once per profile:
hermes -p designer gateway setup
hermes -p programmer gateway setup
hermes -p researcher gateway setup
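Since --clone copies the default profile's .env, cloned profiles can start out sharing a bot token without anyone noticing. The sketch below lists profiles whose token is duplicated. It assumes each profile keeps a TELEGRAM_BOT_TOKEN= line in ~/.hermes/profiles/<name>/.env, and shared_token_profiles is a hypothetical helper:

```shell
#!/usr/bin/env bash
# shared_token_profiles is a hypothetical helper, not part of Hermes.
# It prints one line per duplicated token, naming the profiles that
# share it. Only profile names are printed, never the token itself.
shared_token_profiles() {
  local root="${1:-$HOME/.hermes/profiles}" env_file token
  for env_file in "$root"/*/.env; do
    [ -f "$env_file" ] || continue
    token="$(grep -m1 '^TELEGRAM_BOT_TOKEN=' "$env_file" | cut -d= -f2-)"
    if [ -n "$token" ]; then
      printf '%s\t%s\n' "$token" "$(basename "$(dirname "$env_file")")"
    fi
  done | sort | awk -F'\t' '
    { names[$1] = (names[$1] == "" ? $2 : names[$1] " " $2); count[$1]++ }
    END { for (t in count) if (count[t] > 1) print names[t] }'
}

# Usage:
#   shared_token_profiles    # no output means every profile has its own token
```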
The agents become genuinely distinct through their SOUL.md files: a designer
profile written for hand-drawn technical illustration, a programmer profile
written as a terse staff engineer, a researcher profile written to produce a
daily digest. Edit each at ~/.hermes/profiles/<name>/SOUL.md.
Delegating execution to Claude Code
A programmer profile is more powerful if it does not write code directly but delegates execution to the Claude Code CLI: Hermes orchestrates and decides what is next, while Claude Code does the file edits, runs commands, and manages git. This also lets execution run on a Claude subscription rather than a separate API key.
Ensure the claude binary is on PATH (which claude should print a real
path), then start a session with the programmer profile and send a single
activation prompt instructing it to act as a staff engineer that uses Claude
Code for all execution and to set itself up accordingly. The profile installs
the claude-code skill on its own, verifies the binary, and from then on routes
anything coding-related through Claude Code, choosing between Claude Code’s
one-shot print mode and its interactive mode based on the task. The same
delegation pattern works for other external CLIs.
Teaching a profile a style by example
The self-evolution loop can be used as a setup mechanism. Rather than
hand-writing a skill, feed a profile reference examples (illustrations,
newsletter intros, code-review comments) and ask it to study them and create a
skill (via skill_manage) that reproduces the pattern, including any script the
skill needs. The agent encodes the pattern itself and verifies the result. From
then on, requests in that domain trigger the skill. This works for anything
where consistency matters.
GEPA: optimizing skills offline
The in-agent learning loop (skill creation plus the Curator) has a known weakness: the agent tends toward self-congratulation (it usually believes it performed well, even when it did not), and the same mechanism that auto-generates skills can overwrite manual customizations with worse versions. The agent is, in effect, grading its own work.
GEPA addresses this. GEPA (Genetic-Pareto Prompt Evolution) is not part of
the Hermes runtime. It lives in a companion repository,
NousResearch/hermes-agent-self-evolution, is MIT-licensed, and is published as
an ICLR 2026 Oral paper. It is an offline optimization pipeline: instead of
asking the agent “did you do well?”, GEPA reads execution traces to understand
why things failed, then proposes targeted improvements through reflective
evolutionary search. It uses DSPy + GEPA, needs no GPU (everything runs through
API calls), and costs roughly $2-10 per optimization run.
The pipeline:
- Read the current skill, prompt, or tool description from the Hermes repo.
- Generate an evaluation dataset: synthetic test cases, real session history from state.db, or a hand-curated golden set.
- Run the GEPA optimizer: read execution traces, diagnose failure points, generate candidate variants.
- Evaluate candidates with LLM-as-judge scoring against rubrics (graded, not binary pass/fail).
- Apply constraint gates: the full test suite must pass, skills must stay under a size limit, prompt-caching compatibility is preserved, and semantic purpose must not drift.
- The best valid variant goes out as a pull request against the Hermes repo (never a direct commit) for human review and merge.
git clone https://github.com/NousResearch/hermes-agent-self-evolution.git
cd hermes-agent-self-evolution
pip install -e ".[dev]"
export HERMES_AGENT_REPO=~/.hermes/hermes-agent
python -m evolution.skills.evolve_skill --skill <skill-name> --iterations 10 --eval-source synthetic
GEPA can be skipped initially. It earns its keep when you hit a wall with a skill and want trace-driven, test-gated improvement without the cost of fine-tuning or reinforcement learning. It is also still maturing, so treat it as an advanced, somewhat experimental companion tool, and review every pull request it produces. In one line: the runtime loop captures experience, the Curator keeps the library clean, and GEPA verifies that what is in the library actually works.
Credential management
Running multiple agents across projects makes credential management a real
concern. hermes auth provides credential pools that hold multiple keys per
provider and rotate them automatically when one hits a rate limit or cooldown.
hermes auth # interactive credential wizard
hermes auth list / status
hermes auth add openrouter --api-key sk-or-... # add an API key
hermes auth add anthropic --type oauth # add an OAuth credential
For secrets that should not be stored in Hermes at all, the official
1password skill fetches credentials from 1Password at runtime
(hermes skills install official/security/1password). Plain keys in .env are
acceptable for a hobby setup; for production (multiple agents, key rotation, an
audit trail), use hermes auth.
MCP server integrations
The Model Context Protocol connects Hermes to external systems: a database, GitHub, anything with an API.
hermes mcp serve # run Hermes itself as an MCP server
hermes mcp add github --command "npx @modelcontextprotocol/server-github"
hermes mcp add <name> --url https://remote-mcp-endpoint
hermes mcp list / test / configure / remove
--command runs a local MCP server process; --url connects to a remote
endpoint. hermes mcp configure filters which of a server’s tools Hermes
exposes. MCP servers are configured per-profile by design.
Extending Hermes: plugins and event hooks
Two mechanisms let you extend Hermes without modifying its core.
Plugins add custom tools, hooks, and integrations. There are three plugin
types: general plugins (which contribute tools or hooks), memory providers (the
external memory backends of Chapter 3’s Tier 3), and context engines
(alternative context-management strategies). Plugins are managed through the
interactive hermes plugins UI and live under ~/.hermes/plugins/.
Event hooks run custom code at lifecycle points. They come in two kinds.
Gateway hooks fire around messaging activity and are the right place for
logging, alerting, and outbound webhooks. Plugin hooks fire around the agent’s
tool calls and are the right place for tool interception, metrics, and
guardrails, for example blocking or auditing a class of command before it
runs. Hooks live under ~/.hermes/hooks/. Hooks plus webhooks are also how
inbound automation is wired: a webhook can trigger a Hermes run, which is the
basis for patterns such as an automated GitHub pull-request reviewer.
Provider routing and fallback
Beyond choosing one model, Hermes gives fine-grained control over which
provider serves a request. Provider routing supports sorting, whitelists,
blacklists, and priority ordering so requests can be optimized for cost, speed,
or quality. Fallback providers add automatic failover: when the primary
model errors or is rate-limited, Hermes fails over to a backup, with independent
fallback for auxiliary tasks such as vision and context compression. Configure a
chain with hermes fallback add so an unattended job does not stall on a single
provider’s outage. (Prompt caching, discussed in Chapter 8, is a separate,
always-on built-in: a cross-session one-hour prefix cache for Claude on the
native Anthropic, OpenRouter, and Nous Portal providers.)
Using Hermes elsewhere: API server and IDE integration
Hermes is not confined to its own CLI and gateway.
API server. Hermes can expose itself as an OpenAI-compatible HTTP
endpoint, so any frontend that speaks the OpenAI format (Open WebUI, LobeChat,
LibreChat, and others) can drive the full agent, tools and memory included.
This is the cleanest way to put a custom or shared UI in front of Hermes.
IDE integration (ACP). Through the Agent Client Protocol, Hermes runs inside ACP-compatible editors including VS Code, Zed, and JetBrains IDEs. Chat, tool activity, file diffs, and terminal commands render inside the editor, which makes Hermes usable as a coding agent without leaving the development environment and complements the Claude Code delegation pattern above.
Voice mode and the web dashboard
/voice toggles real-time spoken interaction in the CLI, Telegram, and Discord,
including Discord voice-channel mode. hermes dashboard launches a
browser-based UI for managing configuration, keys, and sessions (requires
pip install hermes-agent[web]); it binds to localhost by default, and the
--insecure flag should be used only behind trusted network controls.
Migrating from OpenClaw
A setup can be migrated from OpenClaw rather than rebuilt. hermes claw migrate
imports persona, memory, skills, providers, messaging tokens, and agent settings
(over 30 categories). The setup wizard also detects ~/.openclaw on first run.
hermes claw migrate --dry-run # preview, write nothing
hermes claw migrate --preset full # all compatible settings, no secrets
hermes claw migrate --preset full --migrate-secrets # include API keys
Secrets are migrated only with --migrate-secrets, and a restore-point snapshot
is written before anything is applied.
Batch processing and research use
Hermes is built by a model-training lab and doubles as a research platform. Batch processing runs the agent across hundreds or thousands of prompts in parallel and emits structured, ShareGPT-format trajectory data, which is useful for generating training data or for large-scale evaluation. The same trajectory export feeds reinforcement-learning training via Nous Research’s Atropos framework. GEPA, above, is the prompt-and-skill-level counterpart that needs no weight training. Most operators will not use the RL path directly, but batch processing is a practical tool any time the same task must run over a large set of inputs, and the research lineage explains why the harness is engineered as carefully as it is.
Chapter 8: Operational Lessons
The points below are drawn from the official Tips and Best Practices documentation and from independent reviews of production deployments. None is in a feature list; each changes how effectively Hermes performs.
Prompt-cache economics
Most LLM providers cache the system-prompt prefix. When the system prompt stays
stable across a session (same model, same context files, same memory), every
message after the first benefits from a cache hit, which is substantially
cheaper than a cold read. The corollary is the lesson: do not change the model
mid-session, and do not churn context files, because either invalidates the
cache. (This is also why Tier 1 memory is a frozen snapshot: a mid-session
memory write would otherwise break the cache.) Switch models between sessions.
/usage reports spend within a session; /insights gives a 30-day view.
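The size of the saving is easy to estimate. The sketch below compares the cost of re-reading the prompt prefix on every message with and without caching. All four inputs are assumptions to replace with the provider's real numbers, the helper name prefix_cost is hypothetical, and any surcharge for writing to the cache is ignored:

```shell
#!/usr/bin/env bash
# prefix_cost is a hypothetical helper; every input is an assumption.
#   $1 prefix size in tokens
#   $2 number of messages in the session
#   $3 input price in dollars per million tokens
#   $4 cached-read price as a fraction of the normal price
prefix_cost() {
  awk -v tokens="$1" -v msgs="$2" -v price="$3" -v ratio="$4" 'BEGIN {
    per_read = tokens * price / 1000000
    # Without caching, every message pays full price for the prefix.
    cold = msgs * per_read
    # With caching, the first read is cold and the rest are discounted.
    cached = per_read + (msgs - 1) * per_read * ratio
    printf "cold=%.3f cached=%.3f\n", cold, cached
  }'
}

# 10,000-token prefix, 50 messages, $3 per million tokens, cached reads at 10%
prefix_cost 10000 50 3 0.1   # prints: cold=1.500 cached=0.177
```

Changing the model or a context file mid-session resets the calculation: the next message is a cold read again.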
Specify the goal, then delegate the steps
Two opposite failure modes occur with prompting. The vague prompt (“fix the code”) produces a vague fix and several rounds of clarification; front-load detail and paste tracebacks directly. The micromanaged prompt, which dictates each step, wastes the agent’s actual strength; “find and fix the failing test” lets it search, run, and iterate. Be specific about the goal; let the agent determine the steps.
Skills are created but not always used
Hermes generates skills, but the agent decides when to load them. It may judge a
skill unnecessary and skip it, or load it and use only part of it. A large
collection of auto-generated skills is therefore not equivalent to a faster
agent. Two habits address this: invoke skills that genuinely matter explicitly
with /<skill-name> rather than relying on the agent to reach for them, and
audit created skills periodically with hermes skills list and
hermes curator run --dry-run. The compounding benefit is real (agents with a
substantial set of self-created skills complete similar tasks markedly faster),
but only when the skills are sound and actually used.
Self-improvement has no inherent ground truth
A self-improving agent improves toward whatever feedback signal it receives. In
domains with clear feedback (code that compiles or fails, tests that pass or
fail), the loop works. In ambiguous domains, or where the operator cannot judge
correctness, there is no reliable ground truth, and the agent can become faster
and more confident at the wrong thing. The agent also tends to rate its own
performance generously. Defenses: review the skills the Curator creates and
keeps; pin sound hand-authored skills (hermes curator pin) so they are not
silently degraded; and, for skills that matter, prefer GEPA’s trace-driven,
test-gated optimization (Chapter 7) over the agent’s self-assessment. Do not
assume “it learned” means “it learned the correct thing.”
Choose a deliberate loop position
A useful frame distinguishes three positions: in the loop (each step is approved), on the loop (the operator supervises and intervenes), and out of the loop (the agent runs unattended). Hermes’s defaults place the operator on the loop for outputs and out of the loop for the learning, and the path of least resistance pulls toward fully out-of-the-loop. That is acceptable where feedback is crisp and a real risk where it is not. Decide deliberately which position each workflow warrants.
Security for an agent with shell access
An agent that runs shell commands unattended needs a deliberate security posture.
- Keep dangerous-command approval enabled. Hermes checks every command against a curated list of dangerous patterns. When it prompts, four choices appear: once, session, always, deny. Choose always with caution: it permanently allowlists the pattern. Begin with session.
- Container backends skip those checks. With Docker, Singularity, Modal, or Daytona, dangerous-command checks are disabled because the container is the security boundary, so the container image must itself be locked down.
- Sandbox untrusted code. When working with an unfamiliar repository, set TERMINAL_BACKEND=docker so a harmful command cannot reach the host.
- Never set GATEWAY_ALLOW_ALL_USERS=true on a bot with terminal access. Use per-platform allowlists (TELEGRAM_ALLOWED_USERS, DISCORD_ALLOWED_USERS) or DM pairing (hermes pairing approve).
- Account for the skill and MCP supply chain. Auto-created skills, community skills, and MCP servers all execute with the agent’s privileges. Inspect skills before installing (hermes skills inspect), and do not point an unsandboxed Hermes instance at a payment or otherwise regulated codebase until its provenance, signing, and audit-trail story is mature.
The consensus from independent reviews is that Hermes is a strong always-on personal agent for individual developers, indie builders, and researchers, but is not yet suited to regulated backend engineering. Match the deployment to the stakes.
Choosing a model for the harness
Hermes is designed so a strong harness makes open or budget models perform at
operator grade, and in practice this largely holds. The practical pattern: a
frontier model (Claude Sonnet/Opus class, GPT class) for architecture and
difficult multi-step reasoning, a fast inexpensive model (Claude Haiku,
DeepSeek) for formatting and boilerplate. Switching is trivial, but the
prompt-cache lesson applies: switch between sessions. Configure a fallback
chain with hermes fallback add so a rate-limited primary does not stall an
unattended job.
CLI reflexes worth building
Ctrl+C pressed once interrupts the agent so it can be redirected mid-thought.
Ctrl+V pastes a clipboard image directly for vision analysis. Alt+Enter or
Ctrl+J inserts a newline without sending. Typing / then Tab autocompletes
every command and installed skill. /title on every session worth finding again
prevents an indistinguishable pile of unnamed sessions.
Appendix: Quick Reference
Install and setup
# Install (Linux / macOS / WSL2 / Android-Termux)
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
# Install (native Windows, PowerShell; early beta)
iex (irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1)
hermes setup # configure (sections: model, terminal, gateway, tools, agent)
hermes doctor # health check (add --fix to attempt repairs)
hermes status # visual status overview
hermes dump # plain-text setup summary for support requests
hermes update # update (add --backup for a pre-update snapshot)
~/.hermes/ layout (key paths)
config.yaml          non-secret configuration
.env                 API keys and secrets
SOUL.md              agent identity (system-prompt slot #1)
memories/MEMORY.md   Tier 1 memory: agent facts (~2,200 chars)
memories/USER.md     Tier 1 memory: user model (~1,375 chars)
skills/              all skills; .archive/ holds Curator-archived skills
state.db             SQLite session store with FTS5 (Tier 2 memory / search)
cron/jobs.json       scheduled jobs
profiles/<name>/     isolated profiles, each a full Hermes home
logs/                agent.log, gateway.log, errors.log
Toolset and configuration rules
- Toolset changes take effect only in a new session (/new).
- Configuration changes require a restart (or /restart for the gateway).
- Enable only the toolsets a task needs.
Session commands
hermes chat -q "one-shot query" # one-shot; shows tool output
hermes -z "scripted one-shot" # final answer only; for scripts, cron, CI
hermes chat # interactive session (or just: hermes)
hermes --continue # resume the most recent session
hermes --resume <id-or-title> # resume a specific session
hermes sessions list / browse / rename / prune / export / delete
Key in-session slash commands
/new (alias /reset)      start a fresh session
/compress [focus]        compress context manually
/background <prompt>     run a prompt in a separate background session
/rollback [n]            list or restore filesystem checkpoints
/model [name]            switch among already-configured models
/skills                  search, install, and manage skills
/<skill-name>            load an installed skill (e.g. /python-testing)
/cron                    manage scheduled tasks (see cron forms below)
/sethome                 set the current chat as the home channel for deliveries
/title <name>            name the current session
/voice [on|off|status]   toggle voice mode
/usage                   token usage and cost for the session
/verbose                 cycle tool-output display modes
/help                    full command list
There is no /skill command. Load a skill with /<skill-name>; manage skills
with /skills.
CLI keyboard shortcuts
Ctrl+C (once)            interrupt the agent, then type to redirect
Ctrl+C (twice/2s)        force exit
Alt+Enter / Ctrl+J       newline without sending (works in every terminal)
Ctrl+V                   paste a clipboard image
/ then Tab               autocomplete commands and installed skills
Context files and references
~/.hermes/SOUL.md        instance-wide identity (system-prompt slot #1)
AGENTS.md                project root: rules and conventions, auto-loaded each session
.hermes.md, CLAUDE.md    also recognized as project context files
.cursorrules             read automatically if present in the working directory
@<path|folder|url>       inject a file, folder, git diff, or URL into one message
Skill and Curator commands
hermes skills browse / search # explore registries
hermes skills install <id> # install a skill
hermes skills inspect <id> # preview without installing
hermes skills list / publish <path>
hermes skills tap add <user>/<repo> # add a GitHub repo as a custom tap
hermes bundles create <name> --skill <id> ... # group skills under one command
hermes curator run --dry-run # preview a Curator pass
hermes curator pin <skill> # protect a skill from archival
Cron
# In-session (/cron add):
/cron add 30m "..." # one-shot in 30 minutes
/cron add "every 2h" "..." # recurring interval
/cron add "0 9 * * 1-5" "..." # standard cron expression
/cron add "every 1h" "..." --skill <name> # attach a skill
# From a shell:
hermes cron list / create / edit / pause / resume / run / remove / status
Reliable messaging delivery (cron script pattern)
#!/bin/bash
set -euo pipefail   # stop if any step fails
TELEGRAM_BOT_TOKEN=$(grep '^TELEGRAM_BOT_TOKEN=' ~/.hermes/.env | cut -d= -f2-)
TELEGRAM_CHAT_ID=$(grep '^TELEGRAM_CHAT_ID=' ~/.hermes/.env | cut -d= -f2-)
RESULT=$(hermes -z "Your query here")
# -f: non-zero exit on HTTP errors; --data-urlencode: safe for arbitrary text
curl -fsS -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
-d "chat_id=${TELEGRAM_CHAT_ID}" --data-urlencode "text=${RESULT}" -d "parse_mode=Markdown"
A home channel must also be set with /sethome for built-in delivery.
Gateway and profile commands
hermes gateway setup / run / install / start / stop / restart / status / list
hermes profile list / create <name> [--clone] / use <name> / show / rename
hermes -p <name> <command> # run any command under a specific profile
Credentials, memory, MCP
hermes auth # credential pool wizard
hermes auth add <provider> --api-key <key> | --type oauth
hermes memory setup / status # configure an external memory provider
hermes mcp serve / add / list / test / configure / remove
Extensibility and integration
hermes plugins # manage plugins (tools, memory providers, context engines)
hermes fallback add <provider> # add a fallback provider for failover
# Event hooks live under ~/.hermes/hooks/ (gateway hooks and plugin hooks)
# API server: expose Hermes as an OpenAI-compatible HTTP endpoint
# IDE (ACP): use Hermes inside VS Code, Zed, or JetBrains editors
GEPA: offline skill optimization (companion repo)
git clone https://github.com/NousResearch/hermes-agent-self-evolution.git
cd hermes-agent-self-evolution && pip install -e ".[dev]"
export HERMES_AGENT_REPO=~/.hermes/hermes-agent
python -m evolution.skills.evolve_skill --skill <skill-name> --iterations 10 --eval-source synthetic
# Output: a pull request against the hermes-agent repo. Review before merging.
Emergency recovery
Hermes will not start:           hermes doctor --fix ; check ~/.hermes/logs/
Tool unavailable after enable:   /new
Config change has no effect:     exit and relaunch (or /restart the gateway)
Cron job not firing:             hermes cron status ; hermes cron list
Gateway not responding:          hermes logs gateway -f ; check the bot token ;
                                 on WSL run `hermes gateway run` inside tmux
A file edit went wrong:          /rollback
Kanban scratch files gone:       unrecoverable; always use --workspace dir:/abs/path
A skill was degraded:            hermes curator pin <skill> ; restore from .archive/
Official resources
Documentation        hermes-agent.nousresearch.com/docs
Source               github.com/NousResearch/hermes-agent
Self-evolution       github.com/NousResearch/hermes-agent-self-evolution
Skills hub           agentskills.io / skills.sh
LLM-readable docs    /docs/llms.txt and /docs/llms-full.txt
Built around Hermes Agent by Nous Research (MIT License). Verified against official documentation and source repositories. Command flags, defaults, and counts change between releases; confirm details against the current docs.
