AI Agent 2026.06.18

Hermes Agent Skills Advanced Guide: From SKILL.md to GEPA Self-Evolution

JEXCLOUD Engineering team · June 18, 2026 · About 32 minutes to read

In early 2026, Nous Research's Hermes Agent crossed 160K GitHub stars in two months. Its core idea is "the agent that grows with you"—an agent that gets smarter the more you use it. The foundation is the Skills system: standardized, evolvable, cross-session procedural memory—not a one-off prompt.

For developers already running Hermes, this guide covers the full advanced picture: ① how Skills differ from Memory and Prompt, and how Progressive Disclosure controls token cost; ② SKILL.md format, Skill Bundles, conditional activation, and Tap publishing; ③ GEPA + DSPy five-stage self-evolution and the community ecosystem. After reading, you can write, bundle, publish, and evolve your own skill assets independently.

01 Why Hermes Agent's Skills system deserves dedicated study

Getting-started tutorials answer "how to install." Advanced work answers "how to make the agent stronger over time." Hermes Skills stand out on four axes:

On-demand loading: zero token cost before activation; Progressive Disclosure keeps spend predictable.
Open standard: follows agentskills.io, so skills are reusable across Hermes, Claude Code, and Cursor.
Composable: Skill Bundles load a full workflow with one slash command.
Evolvable: GEPA analyzes execution traces and improves SKILL.md text without touching model weights.

Four pain points advanced users hit most often:

Token bloat: stuffing every SOP into the system prompt burns thousands of tokens every session.
Wrong skill activation: vague descriptions cause the LLM to load the wrong skill in unrelated contexts.
Fragmented workflows: PR review, TDD, and deploy each need a separate /skill-name—slow and tedious.
No team sharing: skills live in personal folders; onboarding on a new machine is painful.

02 Skills, Memory, and Prompt: what is the difference?

Skills vs Memory vs Prompt
Dimension	Plain Prompt	Memory	Skills
Persistence	Current conversation	Cross-session, permanent	Cross-session, permanent
Load timing	Always in context	Auto-injected each session	On demand
Token cost	Every turn	Small and stable	Zero before activation
Content type	Any intent description	User preferences / facts	Procedural steps
Maintained by	User manually	Agent automatically	User and agent
Shareability	Awkward	Private	Publishable as community Tap

Memory aid: Prompt = sticky note (valid this turn); Memory = notebook (permanent notes, always nearby); Skill = SOP manual (step-by-step process, opened when needed).

Skills complement MCP: MCP provides tool interfaces (e.g., database access); Skills teach the agent how to use those tools correctly for tasks like migrations.

03 SKILL.md format and Progressive Disclosure

All Hermes Skills follow the agentskills.io open standard. Basic frontmatter structure:

SKILL.md

---
name: my-skill
description: |
  Use when the user needs to [...].
version: 1.0.0
license: MIT
compatibility: Requires git, docker
allowed-tools: Bash(git:*) Read
metadata:
  hermes:
    tags: [devops, automation]
    category: software-development
    related_skills: [github-pr-workflow]
    requires_toolsets: [terminal]
    fallback_for_toolsets: [web]
---

# My Skill Title
## Overview
## When to Use
## Procedure
## Common Pitfalls
## Verification Checklist

Recommended directory layout:

~/.hermes/skills/

my-category/my-skill/
├── SKILL.md              # core steps; aim for ≤500 lines
├── references/           # API refs; loaded on demand
├── templates/            # reusable templates
└── scripts/              # scripts the agent can run directly

Progressive Disclosure three-level loading
Level	Content	Trigger	Token cost
Level 0	name + description	Every session start, all skills	~3K total across all skills
Level 1	Full SKILL.md body	`/skill-name` or LLM decides needed	Depends on file length
Level 2	references/ scripts/ files	LLM decides at execution time	On demand, per file
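
To make the Level 0 row concrete, here is a minimal Python sketch of building that index. It is an illustration, not Hermes's loader: read_level0 and level0_tokens are hypothetical helpers, the frontmatter parsing is a hand-rolled subset that only handles plain key: value lines and block scalars, and the four-characters-per-token figure is a rough assumption rather than a real tokenizer.

```python
import re


def read_level0(skill_md: str) -> dict:
    """Return only the name and description from SKILL.md frontmatter."""
    match = re.match(r"^---\n(.*?)\n---", skill_md, re.DOTALL)
    if match is None:
        raise ValueError("SKILL.md has no frontmatter block")

    fields: dict = {}
    block_key = None  # key of the block scalar currently being read, if any
    for line in match.group(1).splitlines():
        top = re.match(r"^([A-Za-z_-]+):\s*(.*)$", line)
        if top:
            key, value = top.group(1), top.group(2).strip()
            block_key = key if value in ("|", ">") else None
            fields[key] = "" if block_key else value
        elif block_key and line.startswith(" "):
            # Indented continuation of a block scalar; folded onto one line here.
            fields[block_key] = f"{fields[block_key]} {line.strip()}".strip()
        # Nested mappings such as metadata.hermes are ignored at Level 0.
    return {"name": fields.get("name", ""),
            "description": fields.get("description", "")}


def level0_tokens(entries: list) -> int:
    """Rough estimate: about four characters per token (an assumption)."""
    return sum(max(1, len(f"{e['name']} {e['description']}") // 4) for e in entries)
```

Summing level0_tokens over every installed skill gives a quick feel for how description length drives the per-session baseline, which is why short, trigger-focused descriptions pay off.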

Writing tips: the description is all that Level 0 sees, so state when to use the skill rather than what it is. The SKILL.md body should include Overview, When to Use, Procedure, Common Pitfalls, and Verification Checklist sections. Validate with skills-ref validate ./my-skill.
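
As a rough pre-check before running the real validator, the section list and the 500-line target can be sketched in a few lines of Python. This is a hypothetical lint, not a substitute for skills-ref validate; lint_skill is an illustrative name, and it assumes each section appears as its own Markdown heading.

```python
REQUIRED_SECTIONS = [
    "Overview",
    "When to Use",
    "Procedure",
    "Common Pitfalls",
    "Verification Checklist",
]
MAX_LINES = 500  # the "aim for <= 500 lines" target from the directory layout


def lint_skill(text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    lines = text.splitlines()
    # Collect heading titles regardless of heading level or letter case.
    headings = {ln.lstrip("#").strip().lower() for ln in lines if ln.startswith("#")}
    problems = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if name.lower() not in headings
    ]
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines exceeds the {MAX_LINES}-line target")
    return problems
```

Returning a list rather than raising keeps every problem visible in one pass, which suits a pre-commit hook.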

04 Skill Bundles: one command for a full workflow

Skill Bundles are a Hermes 2026 feature: a lightweight YAML file packs multiple skills into one slash command. Running /bundle-name loads every listed skill at once.

File location: ~/.hermes/skill-bundles/<slug>.yaml

backend-dev.yaml

name: backend-dev
description: Full backend feature workflow — code review, TDD, and PR management.
skills:
  - github-code-review
  - test-driven-development
  - github-pr-workflow
instruction: |
  Always write failing tests first before implementation.
  Never push directly to main.

Advanced scenarios: research workflows can bundle arxiv, deep-research, plan, and excalidraw; MLOps deploy can bundle vllm, llama-cpp, github-pr-workflow, and systematic-debugging.

Priority rules: when a Bundle and a single Skill share a name, the Bundle wins; missing skills are skipped with a warning, not an error; Bundles do not alter the system prompt, so the Prompt Cache stays valid.
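
The first two priority rules can be sketched in Python as follows. This models the described behavior only; resolve_command and its dictionary inputs are hypothetical and do not mirror Hermes internals.

```python
import warnings


def resolve_command(name: str, bundles: dict, skills: set) -> list[str]:
    """Return the skill names that a slash command /name would load."""
    if name in bundles:
        # Rule 1: a bundle shadows a single skill that has the same name.
        loaded = []
        for skill in bundles[name]["skills"]:
            if skill in skills:
                loaded.append(skill)
            else:
                # Rule 2: a missing skill is a warning, not an error.
                warnings.warn(f"bundle {name!r}: skill {skill!r} not installed, skipping")
        return loaded
    if name in skills:
        return [name]
    raise KeyError(f"unknown command: /{name}")
```

Checking bundles before skills is what gives the Bundle priority; the order of the listed skills is preserved so the load sequence stays predictable.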

CLI

hermes bundles create backend-dev \
  --skills github-code-review,test-driven-development,github-pr-workflow \
  --instruction "Always write failing tests first"

05 Conditional activation: skills that sense the environment

Under metadata.hermes, four activation rules let skills show or hide based on tool availability:

Conditional activation fields and behavior
Field	Behavior
`requires_toolsets`	Hide skill when listed toolsets are missing
`requires_tools`	Hide skill when listed tools are missing
`fallback_for_toolsets`	Hide skill when listed toolsets exist (fallback path)
`fallback_for_tools`	Hide skill when listed tools exist

Classic scenario: a DuckDuckGo search skill sets fallback_for_tools: [web_search]. When the user configures FIRECRAWL_KEY or BRAVE_SEARCH_KEY, the paid web_search tool becomes available and the DuckDuckGo skill hides to save tokens; when that API is unavailable, the fallback surfaces automatically.
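
The four fields reduce to a small visibility predicate. The Python sketch below is one reading of the table, not Hermes's implementation: is_visible is a hypothetical name, and treating a fallback skill as hidden when any listed tool or toolset is present (rather than all of them) is an assumption.

```python
def is_visible(hermes_meta: dict, toolsets: set, tools: set) -> bool:
    """Decide whether a skill should be shown, given what is available."""
    requires_toolsets = set(hermes_meta.get("requires_toolsets", []))
    requires_tools = set(hermes_meta.get("requires_tools", []))
    fallback_toolsets = set(hermes_meta.get("fallback_for_toolsets", []))
    fallback_tools = set(hermes_meta.get("fallback_for_tools", []))

    # requires_*: hide when anything listed is missing.
    if not requires_toolsets <= toolsets or not requires_tools <= tools:
        return False
    # fallback_for_*: hide when a listed item exists (assumption: any match).
    if fallback_toolsets & toolsets or fallback_tools & tools:
        return False
    return True
```

With the DuckDuckGo example, is_visible({"fallback_for_tools": ["web_search"]}, set(), {"web_search"}) evaluates to False, so the fallback stays out of the Level 0 list while the paid tool is configured.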

Platform awareness: telegram-notify can set requires_toolsets: [messaging] and platforms: [telegram, discord]; via the hermes skills TUI you can toggle skills independently for CLI, Telegram, and Discord.

06 Skills Hub and the open-source ecosystem

Official install channels:

hermes skills

hermes skills install official/research/arxiv
hermes skills install https://example.com/SKILL.md --name my-skill
hermes skills install github:openai/skills/k8s
hermes skills tap add github:my-org/my-skills

Notable open-source skill repositories
Repository	Highlights
awesome-hermes-skills	Curated production skills: Deep Research, MLOps, Apple integration; 23 skills wired for GitHub Copilot
hermeshub	Community registry with security scanning and certification; API and marketplace support
ai-agent-skills	191 skills across 28 categories; one-click install for Hermes, Claude Code, and Cursor
hermes-agent	Official source of truth: all built-in skills and authoring conventions

agentskills.io means skills work across Hermes, Claude Code, Cursor, and OpenCode—your assets are not locked to one platform.

07 Publish your Skill Tap: six steps for team and community sharing

A GitHub repo as a Tap lets teams or communities subscribe to your skill set. Recommended repo layout:

my-skills-tap/

my-skills-tap/
├── skills.sh.json          # category config (optional)
├── mlops/vllm-deploy/SKILL.md
├── research/paper-summarizer/SKILL.md
└── README.md

1. Plan categories: organize by domain (MLOps, Research, etc.); write skills.sh.json to control Hub display groups.
2. Write SKILL.md files: one directory per skill; validate with skills-ref validate.
3. Push to GitHub: public or private (private needs a token).
4. Team subscribes: hermes skills tap add github:your-org/your-skills-tap.
5. Update regularly: hermes skills tap update pulls the latest skills.
6. Version control: put ~/.hermes/skills/ in Git; sync across devices with git pull && hermes skills reset.
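
Before pushing, it can help to confirm that the repo really matches the category/skill/SKILL.md layout shown in the tree. The Python sketch below is a hypothetical local check, not part of the hermes CLI; list_tap_skills is an illustrative name.

```python
from pathlib import Path


def list_tap_skills(repo_root: str) -> dict:
    """Map each category directory to the skill directories it contains."""
    found: dict = {}
    # Only paths of the form <category>/<skill>/SKILL.md count as skills.
    for skill_file in sorted(Path(repo_root).glob("*/*/SKILL.md")):
        category = skill_file.parent.parent.name
        found.setdefault(category, []).append(skill_file.parent.name)
    return found
```

A skill that is missing from the result usually means its SKILL.md sits at the wrong depth or is misnamed.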

Tap management

hermes skills tap add github:your-org/private-skills --token $GH_TOKEN
hermes skills tap list
hermes skills tap update

08 Self-evolving Skills: GEPA + DSPy automatic improvement

GEPA (Genetic-Pareto Prompt Evolution) is an ICLR 2026 Oral paper, integrated in hermes-agent-self-evolution. Core idea: instead of fine-tuning the model, it analyzes execution traces, generates variants, and applies multi-objective Pareto optimization to improve skill text. Each optimization run costs roughly $2–10 (API only, no GPU).

Five-stage evolution flow: ① execution trace collection (SQLite); ② reflective failure analysis (actionable side information); ③ targeted mutation (10–20 SKILL.md variants); ④ multi-objective Pareto evaluation (success rate × token efficiency × speed); ⑤ human PR review before merge.
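
Stage ④ is the least intuitive step, so here is a minimal Python sketch of Pareto selection over the three objectives. It illustrates the general technique only and is not the code in hermes-agent-self-evolution; the Variant class, the field names, and the convention that higher is better on every axis are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    success_rate: float      # higher is better
    token_efficiency: float  # higher is better
    speed: float             # higher is better


def dominates(a: Variant, b: Variant) -> bool:
    """a dominates b if it is no worse on every axis and better on one."""
    pairs = [
        (a.success_rate, b.success_rate),
        (a.token_efficiency, b.token_efficiency),
        (a.speed, b.speed),
    ]
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def pareto_front(variants: list) -> list:
    """Keep every variant that no other variant dominates."""
    return [v for v in variants if not any(dominates(o, v) for o in variants)]
```

Unlike a single weighted score, the front can keep several variants that trade success rate against token cost, leaving the final choice to the human review in stage ⑤.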

evolve_skill

export HERMES_AGENT_PATH=~/.hermes
python -m evolution.skills.evolve_skill \
    --skill github-code-review \
    --iterations 10 \
    --eval-source sessiondb

Four safety guardrails: the full test suite must pass 100%; size limits must hold (Skills ≤ 15KB, tool descriptions ≤ 500 chars); the result must stay Prompt Cache compatible; and a semantic preservation check keeps the skill's purpose from drifting.

Official five-phase evolution roadmap
Phase	Optimization target	Status
Phase 1	Skill files (SKILL.md)	Shipped
Phase 2	Tool descriptions	Planned
Phase 3	System prompt fragments	Planned
Phase 4	Tool implementation code	Planned
Phase 5	Continuous improvement loop (fully automated)	Planned

Because Skills follow agentskills.io, you can feed Claude Code or Gemini CLI traces to the optimizer: --eval-source mixed --trace-dirs ~/.claude/traces,~/.hermes/sessions.

09 Plugin skills and advanced authoring tips

Plugins package skills under a plugin:skill namespace. Namespaced skills do not appear in the default skills_list and activate only on explicit user call, and skills within a plugin can cross-reference each other. Loading skill_view("superpowers:writing-plans") also surfaces sibling skills in the same plugin.

Description drives activation accuracy: avoid vague lines like "Helps with code"; state trigger conditions and exclusion cases clearly.

Pitfalls separate good from great: list concrete failure modes, root causes, and fixes (e.g., fragile CSS selectors, GitHub API rate limits, large diff token overflow).

Scripting: reference executable scripts under scripts/ in Procedure; on failure, fall back to references/manual-extract.md.

Skill size guidelines
Size	Recommendation
< 500 lines	Keep everything in SKILL.md
500–1000 lines	Move detail to references/
> 1000 lines	Split strongly; consider two skills
> 15KB	Exceeds GEPA limit; must split
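
The table can be turned into a quick check. The Python sketch below is a hypothetical helper (size_recommendation is an illustrative name); reading "15KB" as 15 × 1024 bytes of UTF-8 text is an assumption.

```python
GEPA_MAX_BYTES = 15 * 1024  # assumption: "15KB" interpreted as 15 KiB


def size_recommendation(skill_text: str) -> str:
    """Map a SKILL.md body to the matching row of the size table."""
    # The byte limit is checked first because it is a hard constraint.
    if len(skill_text.encode("utf-8")) > GEPA_MAX_BYTES:
        return "Exceeds GEPA limit; must split"
    line_count = len(skill_text.splitlines())
    if line_count < 500:
        return "Keep everything in SKILL.md"
    if line_count <= 1000:
        return "Move detail to references/"
    return "Split strongly; consider two skills"
```

Measuring bytes rather than characters matters for non-ASCII content, where one character can occupy several bytes.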

The agent can dynamically patch or create skills via skill_manage; set skills.agent_writes_require_approval: true in config.yaml for a human approval gate.

10 Case study: tech blog workflow Skills design

Build a blog-workflow Bundle that loads SEO research, outline generation, code validation, bilingual check, and publish skills in one shot:

blog-workflow.yaml

name: blog-workflow
description: Full tech blog writing workflow.
skills:
  - seo-keyword-research
  - outline-generator
  - code-example-validator
  - bilingual-checker
  - publish-to-platform
instruction: |
  Always research SEO keywords before writing.
  Ensure all code examples are tested and runnable.
  Generate both Chinese and English title options.

A custom seo-keyword-research skill should set requires_toolsets: [web]. The flow: identify topic → Chinese long-tail keywords (the Chinese equivalents of "how to use X" and "X tutorial") → English long-tail keywords ("X tutorial", "X vs Y") → cross-reference Juejin/Dev.to/HN trending → output 3–5 primary keywords plus a matrix of 10–15 long-tail keywords. Chinese and English audiences search differently, so validate technical term translations per target platform.

11 Hermes Agent Skills FAQ

How do Skills differ from MCP? Skills are procedural knowledge documents; MCP is a tool interface—they complement each other.
Why does my edited Skill still run the old version? Changes do not apply in the current session; start a new session with /reset, or install with --now (invalidates Prompt Cache).
Is GEPA evolution safe? Four guardrails plus human PR review—but still review every diff.
How to reuse in Claude Code? Copy SKILL.md to ~/.claude/skills/, or use ai-agent-skills for one-click multi-platform install.
Does Chinese content affect tokens? Roughly 1–1.5 tokens per Chinese character; keep descriptions in English or bilingual for sharper LLM matching.
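
The token figure in the last answer works out as simple arithmetic. The Python sketch below applies the 1–1.5 tokens-per-character range quoted above to the Chinese characters in a string; estimate_cjk_tokens is a hypothetical helper, it counts only the basic CJK Unified Ideographs block, and it is a rough estimate rather than a real tokenizer.

```python
import math


def estimate_cjk_tokens(text: str) -> tuple:
    """Return a (low, high) token estimate for the Chinese characters only."""
    # Count characters in the basic CJK Unified Ideographs block.
    cjk_chars = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    return cjk_chars, math.ceil(cjk_chars * 1.5)
```

For a 200-character Chinese description this gives a range of 200 to 300 tokens, charged at Level 0 in every session, which is the practical case for keeping descriptions short.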

Further reading: official docs, Chinese docs, GEPA algorithm, DSPy framework.

12 Hard data and JEXCLOUD wrap-up

GitHub stars: Hermes Agent launched early 2026; crossed 160K stars within two months.
Level 0 tokens: all skill name+description fields total ~3K tokens per session.
GEPA per-run cost: roughly $2–10, API-only, no GPU required.
GEPA size limits: Skills ≤ 15KB; tool descriptions ≤ 500 characters.
Community scale: kevinnft/ai-agent-skills has 191 skills in 28 categories; hermeshub has 166 stars with security scanning.

Running Hermes Agent and GEPA evolution pipelines needs a low-latency macOS host that stays online 24/7. A Raspberry Pi runs out of RAM, an oversubscribed shared VPS drops long-lived connections, and home broadband suffers from jitter; each of these degrades Skills trace collection and Gateway uptime.

For production environments that need a stable Hermes Gateway, continuous sessiondb trace collection, and GEPA iteration, JEXCLOUD multi-region bare-metal Macs are the stronger choice: dedicated Apple Silicon, 24/7 uptime, flexible monthly scaling, 120-second node delivery. Configs and pricing: JEXCLOUD pricing.

Tags: Hermes Agent Skills GEPA Skill Bundles Cloud Mac