Hermes Agent Architecture & Features

Understanding Hermes's internals helps you configure, extend, and troubleshoot it more effectively. This chapter breaks down its overall architecture and key design choices.


Overall Architecture

Hermes is built from loosely coupled subsystems that cooperate around a central Agent loop:

┌──────────────────────────────────────────────────────────────┐
│                     Entry / Reach Layer                      │
│   CLI Terminal UI   │  Message Gateway (Telegram/Discord…)   │
└───────────────┬──────────────────────────┬───────────────────┘
                │                          │
                ▼                          ▼
┌──────────────────────────────────────────────────────────────┐
│                          Agent Loop                          │
│ read context → call LLM → parse tool calls → run → write back│
└───┬───────────┬───────────┬───────────┬───────────┬──────────┘
    │           │           │           │           │
    ▼           ▼           ▼           ▼           ▼
 ┌──────┐  ┌────────┐  ┌────────┐ ┌─────────┐ ┌────────────┐
 │Memory│  │ Skills │  │ Tools  │ │  Cron   │ │  Backend   │
 │MEMORY│  │ SKILLS │  │ TOOLS  │ │scheduler│ │Local/Docker│
 └──────┘  └────────┘  └────────┘ └─────────┘ └────────────┘
    │                                               │
    ▼                                               ▼
 ┌──────────────────────────┐         ┌────────────────────────┐
 │  ~/.hermes/ local store  │         │  LLM Provider (model)  │
 │  config / SOUL / MEMORY  │         │ Portal/OpenRouter/vLLM │
 └──────────────────────────┘         └────────────────────────┘

The Agent Loop

At its heart Hermes runs a classic sense–think–act loop:

  1. Assemble context: load SOUL (persona), relevant memories, available tools/skills, current history
  2. Call the LLM: hand the context to the chosen model; get a reply or a tool-call request
  3. Parse & execute: if the model requests a tool, run it in the selected execution backend
  4. Write back: feed tool output back to the model and keep reasoning until the task is done
  5. Consolidate: update memory and generate new skills as needed
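
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Hermes's actual implementation: `ModelReply`, `run_agent_loop`, the step limit, and the scripted `fake_llm` are all hypothetical names introduced here, and step 5 (consolidation) is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ModelReply:
    """Hypothetical reply shape: final text, or a request to run one tool."""
    text: str = ""
    tool_name: Optional[str] = None
    tool_args: dict = field(default_factory=dict)


def run_agent_loop(
    context: list[str],
    call_llm: Callable[[list[str]], ModelReply],
    tools: dict[str, Callable[..., str]],
    max_steps: int = 8,  # assumed safety limit, not a documented Hermes setting
) -> str:
    """Repeat call -> execute -> write back until the model answers in text."""
    for _ in range(max_steps):
        reply = call_llm(context)                        # step 2: call the LLM
        if reply.tool_name is None:
            return reply.text                            # no tool requested: done
        tool = tools.get(reply.tool_name)
        if tool is None:
            output = f"error: unknown tool {reply.tool_name!r}"
        else:
            output = tool(**reply.tool_args)             # step 3: execute the tool
        context.append(f"[{reply.tool_name}] {output}")  # step 4: write back
    return "stopped: step limit reached"


def fake_llm(context: list[str]) -> ModelReply:
    """Scripted stand-in for a model: ask for `add` once, then answer."""
    if not any(line.startswith("[add]") for line in context):
        return ModelReply(tool_name="add", tool_args={"a": 2, "b": 3})
    return ModelReply(text=f"The sum is {context[-1].split()[-1]}.")


answer = run_agent_loop(
    context=["SOUL: be concise", "user: what is 2 + 3?"],  # step 1: assembled context
    call_llm=fake_llm,
    tools={"add": lambda a, b: str(a + b)},
)
print(answer)  # The sum is 5.
```

Passing the model and the tools in as plain callables keeps the loop itself independent of any provider or backend, which mirrors the loose coupling described earlier.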

Hermes ships 11 tool-call parsers to handle the different formats in which models emit tool calls. This is a key enabler of its model-agnostic design.
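
One way to picture a multi-parser setup is a list of parsers tried in order until one recognizes the reply. The sketch below assumes two illustrative formats (JSON inside `<tool_call>` tags, and a bare JSON object); they are examples only and are not a list of Hermes's 11 parsers. All function names are hypothetical.

```python
import json
import re
from typing import Callable, Optional

ToolCall = tuple[str, dict]  # (tool name, arguments)


def parse_bare_json(text: str) -> Optional[ToolCall]:
    """Illustrative format: the whole reply is one JSON object with a "name"."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict) and isinstance(data.get("name"), str):
        args = data.get("arguments", {})
        return (data["name"], args if isinstance(args, dict) else {})
    return None


def parse_tagged(text: str) -> Optional[ToolCall]:
    """Illustrative format: JSON wrapped in <tool_call>...</tool_call>."""
    match = re.search(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)
    return parse_bare_json(match.group(1)) if match else None


# Order matters: more specific formats go first.
PARSERS: list[Callable[[str], Optional[ToolCall]]] = [parse_tagged, parse_bare_json]


def parse_tool_call(text: str) -> Optional[ToolCall]:
    """Try each parser in turn; None means the reply is ordinary text."""
    for parser in PARSERS:
        call = parser(text)
        if call is not None:
            return call
    return None


print(parse_tool_call('<tool_call>{"name": "ls", "arguments": {"path": "."}}</tool_call>'))
print(parse_tool_call('{"name": "ls"}'))
print(parse_tool_call("Just a normal answer."))
```

Supporting a new model family then means appending one function to the list rather than changing the loop.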


Local Storage Layout: ~/.hermes/

All state lives under your home directory, making backup, migration, and auditing easy:

~/.hermes/
├── config.yaml          # Main configuration (model, tools, platforms, backend…)
├── SOUL.md              # Agent persona / system prompt
├── MEMORY.md            # Distilled facts and patterns (long-term memory)
├── USER.md              # User profile and preferences
├── skills/              # Custom and auto-generated skills
├── conversations/       # Session histories
└── cache/               # Cached data

Override the location with the HERMES_CONFIG_DIR environment variable.
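
For scripts that back up or inspect this directory, the lookup can be mirrored as follows. This is a sketch under assumptions: only the `HERMES_CONFIG_DIR` variable and the `~/.hermes/` default come from the text above; the helper names, the tilde expansion, and treating an empty value as unset are choices made here for illustration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def hermes_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the state directory: HERMES_CONFIG_DIR if set, else ~/.hermes."""
    env = os.environ if env is None else env
    override = env.get("HERMES_CONFIG_DIR")
    if override:  # assumption: an empty value counts as "not set"
        return Path(override).expanduser()
    return Path.home() / ".hermes"


def state_files(home: Path) -> dict[str, Path]:
    """Map the top-level files from the layout above to full paths."""
    names = ("config.yaml", "SOUL.md", "MEMORY.md", "USER.md")
    return {name: home / name for name in names}


home = hermes_home()
for name, path in state_files(home).items():
    print(f"{name:12} {path}")
```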


A Clean Three-Way Separation

Hermes cleanly separates who the agent is, what it remembers, and what it knows about you:

File        Role                           Analogy
SOUL.md     Persona, values, style         The agent's "personality"
MEMORY.md   Learned facts and patterns     The agent's "experience"
USER.md     Your preferences and context   What it knows about you

This separation lets you tune any layer independently — swapping personas won't lose memory, and pruning memory won't change the persona. See Memory System.
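
The independence of the layers can be illustrated with a small sketch that builds context from whichever of the three files exist. The `compose_context` helper and the bracketed section labels are hypothetical; how Hermes actually combines the files is not specified here.

```python
import tempfile
from pathlib import Path

# (file, label) pairs; the labels are illustrative, not a Hermes format.
LAYERS = (("SOUL.md", "Persona"), ("MEMORY.md", "Memory"), ("USER.md", "User"))


def compose_context(home: Path) -> str:
    """Concatenate the layers that exist, each under its own label."""
    parts = []
    for filename, label in LAYERS:
        path = home / filename
        if path.exists():
            body = path.read_text(encoding="utf-8").strip()
            parts.append(f"[{label}]\n{body}")
    return "\n\n".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    (home / "SOUL.md").write_text("You are terse.", encoding="utf-8")
    (home / "MEMORY.md").write_text("Project uses Python 3.12.", encoding="utf-8")
    before = compose_context(home)

    # Swap the persona only; the memory layer is untouched.
    (home / "SOUL.md").write_text("You are cheerful.", encoding="utf-8")
    after = compose_context(home)

print(after)
```

Because each layer is read from its own file, replacing `SOUL.md` changes only the persona section of the result.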


Execution Backends

Tool commands don't have to run directly on your host. Hermes offers 6 switchable execution backends that trade off convenience against isolation:

Backend       Description                        Use case
Local         Runs directly on the host          Personal use, fastest
Docker        Runs in a container                Isolate risk, reproducible
SSH           Runs on a remote server            Operating remote machines
Singularity   HPC clusters                       Scientific computing
Modal         Serverless, hibernates when idle   Elastic, cost-saving
Daytona       Serverless, persistent             Persistent workspaces

Execution backends are a core part of Hermes's security: putting risky operations in a container or remote sandbox sharply limits blast radius.
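
To make the idea of a switchable backend concrete, the sketch below wraps the same shell command for three of the backends using standard `sh`, `docker run`, and `ssh` invocations. It is an assumption-laden illustration: `build_argv`, `run`, the image name, and the host are hypothetical, and Hermes's real backend interface may look quite different.

```python
import subprocess


def build_argv(backend: str, command: str, target: str = "") -> list[str]:
    """Wrap a shell command for the chosen backend.

    `target` is the container image (docker) or the host (ssh); it is
    ignored for the local backend.
    """
    if backend == "local":
        return ["sh", "-c", command]
    if backend == "docker":
        # --rm removes the container afterwards, so nothing persists on the host.
        return ["docker", "run", "--rm", target, "sh", "-c", command]
    if backend == "ssh":
        return ["ssh", target, command]
    raise ValueError(f"unsupported backend: {backend}")


def run(backend: str, command: str, target: str = "") -> str:
    """Execute the wrapped command and return its standard output."""
    result = subprocess.run(
        build_argv(backend, command, target),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Same command, three execution contexts (image and host are placeholders).
print(build_argv("local", "ls /tmp"))
print(build_argv("docker", "ls /tmp", "python:3.12-slim"))
print(build_argv("ssh", "ls /tmp", "user@build-host"))
```

Only the wrapper changes between backends, so the caller can move a risky command into a container by switching one argument.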


The Gateway

The gateway is the adapter layer that connects external messaging platforms. It translates messages from Telegram, Discord, etc. into Agent-loop input, and sends replies back to the right platform:

[User sends on Telegram] → gateway adapter → Agent loop → reply → gateway → [Telegram]

A single gateway can serve multiple platforms, with DM pairing and user allowlists for access control. See Message Channels.
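
The adapter pattern described above can be sketched as one gateway object with a per-platform send function and a per-platform allowlist. Everything named here (`Inbound`, `Gateway`, `register`, `handle`) is hypothetical and stands in for whatever Hermes uses internally; real adapters would call the Telegram or Discord APIs instead of appending to a list, and DM pairing is not modeled.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Inbound:
    """A platform message normalized into one common shape."""
    platform: str
    user_id: str
    text: str


class Gateway:
    def __init__(self, agent: Callable[[str], str], allowlist: dict[str, set[str]]):
        self.agent = agent              # stands in for the Agent loop
        self.allowlist = allowlist      # platform -> allowed user ids
        self.senders: dict[str, Callable[[str, str], None]] = {}

    def register(self, platform: str, send: Callable[[str, str], None]) -> None:
        """Attach the function that delivers replies on one platform."""
        self.senders[platform] = send

    def handle(self, msg: Inbound) -> bool:
        """Route one message; return False if it was rejected."""
        send = self.senders.get(msg.platform)
        if send is None:
            return False                # no adapter for this platform
        if msg.user_id not in self.allowlist.get(msg.platform, set()):
            return False                # user is not on the allowlist
        send(msg.user_id, self.agent(msg.text))
        return True


outbox: list[tuple[str, str, str]] = []
gateway = Gateway(agent=lambda text: f"echo: {text}",
                  allowlist={"telegram": {"alice"}, "discord": {"bob"}})
gateway.register("telegram", lambda user, reply: outbox.append(("telegram", user, reply)))
gateway.register("discord", lambda user, reply: outbox.append(("discord", user, reply)))

accepted = gateway.handle(Inbound("telegram", "alice", "hello"))
rejected = gateway.handle(Inbound("telegram", "mallory", "hello"))
print(accepted, rejected, outbox)
```

The reply always goes back through the sender registered for the platform the message came from, which is the round trip shown in the flow line above.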


Key Features at a Glance

  • Model-agnostic: 11 tool parsers + 200+ models, switch anytime with /model
  • Local-first: all state in ~/.hermes/, no cloud lock-in
  • Clean layering: persona / memory / user are independently tunable
  • Isolatable execution: 6 backends chosen by risk level
  • Proactive scheduling: built-in Cron for unattended runs
  • Capable of growth: skills auto-distill, prompts can evolve (GEPA)

Next Steps