Hermes Agent Architecture & Features

Understanding Hermes's internals helps you configure, extend, and troubleshoot it more effectively. This chapter breaks down its overall architecture and key design choices.

Overall Architecture

Hermes is built from loosely coupled subsystems that cooperate around a central Agent loop:

┌──────────────────────────────────────────────────────────────┐
│                     Entry / Reach Layer                       │
│   CLI Terminal UI   │   Message Gateway (Telegram/Discord…)   │
└───────────────┬──────────────────────────┬───────────────────┘
                │                          │
                ▼                          ▼
┌──────────────────────────────────────────────────────────────┐
│                        Agent Loop                            │
│  read context → call LLM → parse tool calls → run → write back│
└───┬───────────┬───────────┬───────────┬───────────┬──────────┘
    │           │           │           │           │
    ▼           ▼           ▼           ▼           ▼
 ┌──────┐  ┌────────┐  ┌────────┐  ┌─────────┐ ┌────────────┐
 │Memory│  │ Skills │  │ Tools  │  │  Cron   │ │  Backend   │
 │MEMORY│  │ SKILLS │  │ TOOLS  │  │scheduler│ │Local/Docker│
 └──────┘  └────────┘  └────────┘  └─────────┘ └────────────┘
    │                                              │
    ▼                                              ▼
 ┌──────────────────────────┐          ┌────────────────────────┐
 │  ~/.hermes/ local store   │          │  LLM Provider (model)  │
 │  config / SOUL / MEMORY  │          │  Portal/OpenRouter/vLLM│
 └──────────────────────────┘          └────────────────────────┘

The Agent Loop

At its heart Hermes runs a classic sense–think–act loop:

1. Assemble context: load SOUL (persona), relevant memories, available tools/skills, and the current history
2. Call the LLM: hand the context to the chosen model; get a reply or a tool-call request
3. Parse & execute: if the model requests a tool, run it in the selected execution backend
4. Write back: feed tool output back to the model and keep reasoning until the task is done
5. Consolidate: update memory and generate new skills as needed
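
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Hermes's actual implementation: `run_agent_loop`, `Reply`, `ToolCall`, and `fake_llm` are hypothetical names, the model is a stand-in function, and parsing is assumed to have already happened inside `llm`, which returns a structured `ToolCall`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Reply:
    text: str = ""
    tool_call: Optional[ToolCall] = None  # set when the model wants a tool run


def run_agent_loop(task, llm, tools, soul="", memories=(), consolidate=None, max_steps=8):
    # 1. Assemble context: persona, memories, then the user's task.
    history = [("system", soul)]
    history += [("memory", m) for m in memories]
    history.append(("user", task))

    for _ in range(max_steps):
        reply = llm(history)                      # 2. Call the LLM.
        if reply.tool_call is None:               # Plain reply: the task is done.
            if consolidate is not None:
                consolidate(history, reply.text)  # 5. Consolidate.
            return reply.text
        call = reply.tool_call
        result = tools[call.name](**call.args)    # 3. Execute the requested tool.
        history.append(("tool", f"{call.name} -> {result}"))  # 4. Write back.
    raise RuntimeError("step limit reached before the model finished")


def fake_llm(history):
    """Stand-in model: asks for one tool call, then reports the result."""
    role, content = history[-1]
    if role == "tool":
        return Reply(text=f"Done: {content}")
    return Reply(tool_call=ToolCall("add", {"a": 2, "b": 3}))


saved = []  # hypothetical stand-in for long-term memory
answer = run_agent_loop(
    "add 2 and 3",
    fake_llm,
    tools={"add": lambda a, b: a + b},
    consolidate=lambda history, final: saved.append(final),
)
```

The `max_steps` cap is a design choice of this sketch: a loop that keeps reasoning "until the task is done" needs some bound so that a confused model cannot run tools forever.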

Hermes supports 11 tool-call parsers to handle the different ways models emit tool calls — a key enabler of its model-agnostic design.
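
To illustrate why several parsers are needed, here is a hedged sketch of a parser registry. The two formats shown (a bare JSON object, and JSON wrapped in `<tool_call>` tags) are illustrative assumptions and are not claimed to be among Hermes's 11 parsers; `PARSERS`, `parser`, and `parse_tool_call` are hypothetical names.

```python
import json
import re

PARSERS = {}  # format name -> parser function


def parser(name):
    """Decorator that registers a parser under a format name."""
    def register(fn):
        PARSERS[name] = fn
        return fn
    return register


@parser("json")
def parse_json(text):
    # Model output is expected to be a bare JSON object.
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict) and "name" in data:
        return data["name"], data.get("arguments", {})
    return None


TAG = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


@parser("tagged")
def parse_tagged(text):
    # The JSON payload is embedded in surrounding prose inside tags.
    match = TAG.search(text)
    return parse_json(match.group(1).strip()) if match else None


def parse_tool_call(text, format_name):
    """Return (tool_name, arguments), or None if no tool call was found."""
    return PARSERS[format_name](text)
```

With a registry like this, supporting another model family means adding one more function, while the rest of the loop keeps working with the same `(tool_name, arguments)` pair.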

Local Storage Layout: `~/.hermes/`

All state lives in a single directory under your home directory, which makes backup, migration, and auditing easy:

~/.hermes/
├── config.yaml          # Main configuration (model, tools, platforms, backend…)
├── SOUL.md              # Agent persona / system prompt
├── MEMORY.md            # Distilled facts and patterns (long-term memory)
├── USER.md              # User profile and preferences
├── skills/              # Custom and auto-generated skills
├── conversations/       # Session histories
└── cache/               # Cached data

Override the location with the HERMES_CONFIG_DIR environment variable.
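
As a rough sketch of how a script might locate these files, the snippet below resolves the directory from `HERMES_CONFIG_DIR` and falls back to `~/.hermes/`. The helper names `hermes_dir` and `layout` are hypothetical, and expanding `~` inside the override is an assumption of this example rather than documented Hermes behavior.

```python
import os
from pathlib import Path

# Entries taken from the directory listing above.
ENTRIES = (
    "config.yaml", "SOUL.md", "MEMORY.md", "USER.md",
    "skills", "conversations", "cache",
)


def hermes_dir(env=None):
    """Return the state directory, honoring HERMES_CONFIG_DIR if set."""
    env = os.environ if env is None else env
    override = env.get("HERMES_CONFIG_DIR")
    if override:
        return Path(override).expanduser()  # assumption: '~' is expanded
    return Path.home() / ".hermes"


def layout(root):
    """Map each known entry name to its full path under root."""
    return {name: root / name for name in ENTRIES}
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.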

A Clean Three-Way Separation

Hermes cleanly separates "who it is, what it remembers, and what it knows about you":

File	Role	Analogy
SOUL.md	Persona, values, style	The agent's "personality"
MEMORY.md	Learned facts and patterns	The agent's "experience"
USER.md	Your preferences and context	What it knows about you

This separation lets you tune any layer independently — swapping personas won't lose memory, and pruning memory won't change the persona. See Memory System.
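
The independence of the three files can be illustrated with a small sketch. It assumes, hypothetically, that the layers are combined into a single prompt as labeled sections; how Hermes actually assembles them is not specified here, and `load_layers` and `build_system_prompt` are made-up names.

```python
from pathlib import Path

LAYERS = ("SOUL.md", "MEMORY.md", "USER.md")


def load_layers(root):
    """Read each layer file under root; a missing file counts as empty."""
    root = Path(root)
    layers = {}
    for name in LAYERS:
        path = root / name
        layers[name] = path.read_text(encoding="utf-8") if path.exists() else ""
    return layers


def build_system_prompt(layers):
    """Join the non-empty layers into one prompt, one labeled section each."""
    sections = [
        f"## {name}\n{text.strip()}"
        for name, text in layers.items()
        if text.strip()
    ]
    return "\n\n".join(sections)


layers = {"SOUL.md": "Be terse.", "MEMORY.md": "User runs Debian.", "USER.md": ""}
before = build_system_prompt(layers)

# Swapping the persona touches only the SOUL.md entry; memory is unchanged.
swapped = dict(layers, **{"SOUL.md": "Be friendly."})
after = build_system_prompt(swapped)
```

Because each layer is a separate file and a separate section, replacing one leaves the text of the others byte-for-byte identical.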

Execution Backends

Tool commands don't have to run directly on your host. Hermes offers 6 switchable execution backends that trade off convenience against isolation:

Backend	Description	Use case
Local	Runs directly on the host	Personal use, fastest
Docker	Runs in a container	Isolate risk, reproducible
SSH	Runs on a remote server	Operating remote machines
Singularity	HPC clusters	Scientific computing
Modal	Serverless, hibernates when idle	Elastic, cost-saving
Daytona	Serverless, persistent	Persistent workspaces

Execution backends are a core part of Hermes's security model: running risky operations in a container or remote sandbox sharply limits the blast radius.
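
One way to picture a switchable backend is as a function that wraps the same shell command in a different launcher. The sketch below covers only three of the six backends and is an assumption about the general idea, not Hermes's code: `build_argv` is a hypothetical helper, and the image name and SSH host are placeholders.

```python
import subprocess


def build_argv(backend, command, *, image="python:3.12-slim", host="user@example.com"):
    """Wrap a shell command for the chosen backend (illustrative subset)."""
    if backend == "local":
        # Runs directly on the host shell.
        return ["sh", "-c", command]
    if backend == "docker":
        # Runs in a throwaway container that is removed on exit.
        return ["docker", "run", "--rm", image, "sh", "-c", command]
    if backend == "ssh":
        # Runs on a remote machine; ssh hands the string to the remote shell.
        return ["ssh", host, command]
    raise ValueError(f"unsupported backend in this sketch: {backend}")


def run(backend, command, **options):
    """Execute the wrapped command and capture its output as text."""
    argv = build_argv(backend, command, **options)
    return subprocess.run(argv, capture_output=True, text=True)
```

Because the tool only ever produces `command`, switching from `local` to `docker` changes where the command runs without changing the tool itself.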

The Gateway

The gateway is the adapter layer that connects external messaging platforms. It translates messages from Telegram, Discord, etc. into Agent-loop input, and sends replies back to the right platform:

[User sends on Telegram] → gateway adapter → Agent loop → reply → gateway → [Telegram]

A single gateway can serve multiple platforms, with DM pairing and user allowlists for access control. See Message Channels.
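
The adapter idea, including the allowlist check, might look roughly like the following. All names here (`Inbound`, `Gateway`, `register`, `handle`) are hypothetical, the agent is reduced to a plain function from text to text, and DM pairing is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inbound:
    platform: str
    user_id: str
    text: str


class Gateway:
    def __init__(self, agent, allowlist):
        self.agent = agent          # callable: message text -> reply text
        self.allowlist = allowlist  # platform -> set of permitted user ids
        self.senders = {}           # platform -> send(user_id, text)

    def register(self, platform, send):
        """Attach the function that delivers replies on one platform."""
        self.senders[platform] = send

    def handle(self, msg):
        """Route one message through the agent; return False if rejected."""
        if msg.user_id not in self.allowlist.get(msg.platform, set()):
            return False  # unknown users never reach the agent
        reply = self.agent(msg.text)
        self.senders[msg.platform](msg.user_id, reply)
        return True


sent = []
gateway = Gateway(agent=lambda text: text.upper(), allowlist={"telegram": {"42"}})
gateway.register("telegram", lambda user, text: sent.append(("telegram", user, text)))

accepted = gateway.handle(Inbound("telegram", "42", "hi"))
rejected = gateway.handle(Inbound("telegram", "7", "hi"))
```

Keeping one sender per platform in a dictionary is what lets a single gateway object serve several platforms while replies still return to the platform they came from.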

Key Features at a Glance

Model-agnostic: 11 tool-call parsers and 200+ models; switch anytime with /model
Local-first: all state in ~/.hermes/, no cloud lock-in
Clean layering: persona / memory / user are independently tunable
Isolatable execution: 6 backends chosen by risk level
Proactive scheduling: built-in Cron for unattended runs
Capable of growth: skills auto-distill, prompts can evolve (GEPA)

Next Steps

Installation & Usage — get this architecture running
Tool System — dive into 40+ tools and backends
Memory System — how three-layer memory retrieves and consolidates
