# Hermes Agent Source Code Walkthrough: How a Closed Learning Loop Makes the Agent Smarter the More You Use It
Hermes Agent has a lot of momentum right now: it has been blowing up on Twitter and today it hit #1 on GitHub Trending, currently at 50.6K stars and climbing fast. Hermes Agent is a self-hosted agent framework that Nous Research open-sourced this February, and it is widely regarded as the first true competitor to OpenClaw since that project launched. After using it for a week, I also feel this "Hermès" is somewhat smarter than the "crawfish." The biggest design difference from OpenClaw is that Hermes ships with a closed learning loop built in: through trigger → review → write-back → re-inject, it genuinely gets the self-improving flywheel spinning. In this post we go through the source code together to answer two questions:

- Why does Hermes Agent get smarter the more you use it?
- How exactly does Hermes turn the closed learning loop into a self-improving flywheel that keeps running?

## System architecture overview

Setting the model and concrete tasks aside for a moment, the Hermes runtime splits into roughly three layers: the user interface layer, the core agent layer, and the execution backend layer. The pieces that actually drive the closed learning loop sit almost entirely in the middle layer, especially in how AIAgent, PromptBuilder, SessionDB, and the tool system connect.

## How Hermes runs once the user sends a prompt

With the overall architecture in place, it is worth walking the main execution path once; the closed learning loop is much easier to follow once you see where it plugs in.

## How the closed learning loop spins

### The self-improving flywheel

1. Experience accumulation
   - Conversation experience: the agent gains experience by interacting with the user
   - Memory Nudge: triggers a memory review every 10 conversation turns
   - Skill Nudge: triggers a skill review every 15 tool iterations
2. Knowledge extraction
   - MEMORY_REVIEW_PROMPT: evaluates user preferences, expectations, and important information
   - SKILL_REVIEW_PROMPT: evaluates complex tasks, trial-and-error processes, and reusable methods
3. Knowledge consolidation
   - memory_tool: updates MEMORY.md (environment facts) and USER.md (user profile)
   - skill_manage: creates new skills or updates existing ones
4. Capability improvement
   - Stronger user modeling: better personalization
   - A larger knowledge base: more efficient task execution
5. Closing the loop: the improved capabilities produce better conversations, which feed the cycle again

Key points:

- The whole learning loop runs asynchronously in the background via spawn_background_review
- The Memory system supplies declarative knowledge (knowing what); the Skills system supplies procedural knowledge (knowing how)
- The two systems complement each other and together form the agent's intelligence base
- This is a genuinely self-improving system: it keeps evolving with no human intervention

## Implementing the flywheel: how spawn_background_review gets triggered

The description above is the abstract view. Down in the source, the key entry point of the closed learning loop is `_spawn_background_review`. Once triggered, it runs the review asynchronously without affecting the main conversation flow. The system maintains two counters that decide when a review fires:

- `_turns_since_memory`: conversation turns since the last memory review
- `_iters_since_skill`: tool iterations since the last skill review

**Skill Nudge: triggered by tool-call count.** The skill nudge suits creating reusable skills after a complex task:

- Default interval: fires once every 15 tool calls
- Config: `skills.creation_nudge_interval` in config.yaml, default 15
- Counter: `_iters_since_skill`, incremented after every tool call

**Memory Nudge: triggered by conversation turns.** The memory nudge suits periodically saving user preferences and important information:

- Default interval: fires once every 10 conversation turns
- Config: `memory.nudge_interval` in config.yaml, default 10
- Counter: `_turns_since_memory`, incremented at the start of every turn
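Before reading the real implementation, here is a minimal sketch of how two such counters can gate the reviews. The counter and interval names match the ones above; the wrapper class and method names are my own invention, not Hermes code:

```python
# Minimal sketch of counter-gated review triggers. Only the counter and
# interval names come from Hermes; the class and control flow are assumed.
class NudgeCounters:
    def __init__(self, memory_interval: int = 10, skill_interval: int = 15):
        self._memory_nudge_interval = memory_interval  # memory.nudge_interval
        self._skill_nudge_interval = skill_interval    # skills.creation_nudge_interval
        self._turns_since_memory = 0
        self._iters_since_skill = 0

    def on_turn_start(self) -> bool:
        """Called at the start of each turn; True means 'fire a memory review'."""
        self._turns_since_memory += 1
        if self._turns_since_memory >= self._memory_nudge_interval:
            self._turns_since_memory = 0
            return True
        return False

    def on_tool_call(self) -> bool:
        """Called after each tool call; True means 'fire a skill review'."""
        self._iters_since_skill += 1
        if self._iters_since_skill >= self._skill_nudge_interval:
            self._iters_since_skill = 0
            return True
        return False
```

When either method returns True, the agent snapshots the conversation and hands it to the background reviewer, whose real implementation follows.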
### The core of spawn_background_review

This is where the loop dispatches to a memory review, a skill review, or both, as needed:

```python
def _spawn_background_review(
    self,
    messages_snapshot: List[Dict],
    review_memory: bool = False,
    review_skills: bool = False,
) -> None:
    """Spawn a background thread to review the conversation for memory/skill saves.

    Creates a full AIAgent fork with the same model, tools, and context as the
    main session. The review prompt is appended as the next user turn in the
    forked conversation. Writes directly to the shared memory/skill stores.
    Never modifies the main conversation history or produces user-visible output.
    """
    import threading

    # Pick the right prompt based on which triggers fired
    if review_memory and review_skills:
        prompt = self._COMBINED_REVIEW_PROMPT
    elif review_memory:
        prompt = self._MEMORY_REVIEW_PROMPT
    else:
        prompt = self._SKILL_REVIEW_PROMPT

    def _run_review():
        import contextlib, os as _os
        review_agent = None
        try:
            with open(_os.devnull, "w") as _devnull, \
                 contextlib.redirect_stdout(_devnull), \
                 contextlib.redirect_stderr(_devnull):
                review_agent = AIAgent(
                    model=self.model,
                    max_iterations=8,
                    quiet_mode=True,
                    platform=self.platform,
                    provider=self.provider,
                )
                review_agent._memory_store = self._memory_store
                review_agent._memory_enabled = self._memory_enabled
                review_agent._user_profile_enabled = self._user_profile_enabled
                review_agent._memory_nudge_interval = 0
                review_agent._skill_nudge_interval = 0
                review_agent.run_conversation(
                    user_message=prompt,
                    conversation_history=messages_snapshot,
                )

            # Scan the review agent's messages for successful tool actions
            # and surface a compact summary to the user.
            actions = []
            for msg in getattr(review_agent, "_session_messages", []):
                if not isinstance(msg, dict) or msg.get("role") != "tool":
                    continue
                try:
                    data = json.loads(msg.get("content", "{}"))
                except (json.JSONDecodeError, TypeError):
                    continue
                if not data.get("success"):
                    continue
                message = data.get("message", "")
                target = data.get("target", "")
                if "created" in message.lower():
                    actions.append(message)
                elif "updated" in message.lower():
                    actions.append(message)
                elif "added" in message.lower() or (target and "add" in message.lower()):
                    label = "Memory" if target == "memory" else "User profile" if target == "user" else target
                    actions.append(f"{label} updated")
                elif "Entry added" in message:
                    label = "Memory" if target == "memory" else "User profile" if target == "user" else target
                    actions.append(f"{label} updated")
                elif "removed" in message.lower() or "replaced" in message.lower():
                    label = "Memory" if target == "memory" else "User profile" if target == "user" else target
                    actions.append(f"{label} updated")
            if actions:
                summary = " · ".join(dict.fromkeys(actions))
                self._safe_print(f"{summary}")
                _bg_cb = self.background_review_callback
                if _bg_cb:
                    try:
                        _bg_cb(f"{summary}")
                    except Exception:
                        pass
        except Exception as e:
            logger.debug("Background memory/skill review failed: %s", e)
        finally:
            # Explicitly close the OpenAI/httpx client so GC doesn't
            # try to clean it up on a dead asyncio event loop (which
            # produces "Event loop is closed" errors in the terminal).
            if review_agent is not None:
                client = getattr(review_agent, "client", None)
                if client is not None:
                    try:
                        review_agent._close_openai_client(
                            client, reason="bg_review_done", shared=True
                        )
                        review_agent.client = None
                    except Exception:
                        pass

    t = threading.Thread(target=_run_review, daemon=True, name="bg-review")
    t.start()
```

## The skill-evolution flywheel: Skill Nudge → Skill Review → skill_manage (create/patch)

SKILL_REVIEW_PROMPT guides the LLM to a judgment, and the rest of the flow follows from it.

File: run_agent.py (L1790-1798)

```python
_SKILL_REVIEW_PROMPT = (
    "Review the conversation above and consider saving or updating a skill if appropriate.\n\n"
    "Focus on: was a non-trivial approach used to complete a task that required trial and error, "
    "or changing course due to experiential findings along the way, or did the user expect or "
    "desire a different method or outcome?\n\n"
    "If a relevant skill already exists, update it with what you learned. "
    "Otherwise, create a new skill if the approach is reusable.\n"
    "If nothing is worth saving, just say 'Nothing to save.' and stop."
)
```

The prompt walks the review agent through three steps:

- Assess task complexity: identify non-trivial tasks that required trial and error, accumulated experience, or a change of approach along the way
- Check existing skills: determine whether a relevant skill already exists
- Decide on a branch:
  - a relevant skill exists → update it (patch)
  - no relevant skill exists and the approach is reusable → create one (create)
  - nothing worth saving → reply "Nothing to save."

Once the LLM reaches a judgment, it calls skill_manage as the prompt instructs. Under the prompt's guidance the LLM:

- analyzes whether a non-trivial approach was used in the conversation
- checks whether a trial-and-error process took place
- evaluates whether the approach is reusable
- queries the existing skill library to decide whether an update is needed

### skill_manage action dispatch

The skill_manage function dispatches to the matching handler based on the LLM's decision:

- `action="create"` → calls `_create_skill`
- `action="patch"` → calls `_patch_skill`

Key points:

- The entire judgment is made by the AI model from the prompt alone; there are no hardcoded rules
- The LLM's judgment directly determines which skill operation follows
- Every operation includes a security scan and error handling
- A successful operation clears the skills system prompt cache
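To make the dispatch concrete, here is roughly what a create call looks like when the review agent decides an approach is reusable. The skill name, category, and SKILL.md content are invented for illustration; only the parameter names come from the `skill_manage` signature listed below:

```python
# Hypothetical review-agent call; skill name and SKILL.md content are invented.
result = skill_manage(
    action="create",
    name="pdf-table-extract",
    category="data",
    content=(
        "---\n"
        "name: pdf-table-extract\n"
        "description: Extract tables from scanned PDFs with an OCR fallback\n"
        "---\n"
        "1. Try text-layer extraction first; fall back to OCR only if it is empty.\n"
        "2. Normalize headers before writing the CSV.\n"
    ),
)
# skill_manage returns a JSON string, e.g. {"success": true, ...}; on success
# the skills system prompt cache is cleared so the next turn sees the skill.
```

The two pieces of source behind this, the post-write security scan and the dispatcher itself, both live in tools/skill_manager_tool.py.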
File: tools/skill_manager_tool.py (L56-74)

```python
def _security_scan_skill(skill_dir: Path) -> Optional[str]:
    """Scan a skill directory after write. Returns error string if blocked, else None."""
    if not _GUARD_AVAILABLE:
        return None
    try:
        result = scan_skill(skill_dir, source="agent-created")
        allowed, reason = should_allow_install(result)
        if allowed is False:
            report = format_scan_report(result)
            return f"Security scan blocked this skill ({reason}):\n{report}"
        if allowed is None:
            # "ask" — allow but include the warning so the user sees the findings
            report = format_scan_report(result)
            logger.warning("Agent-created skill has security findings: %s", reason)
            # Don't block — return None to allow, but log the warning
            return None
    except Exception as e:
        logger.warning("Security scan failed for %s: %s", skill_dir, e, exc_info=True)
        return None
```

File: tools/skill_manager_tool.py (L574-632)

```python
def skill_manage(
    action: str,
    name: str,
    content: str = None,
    category: str = None,
    file_path: str = None,
    file_content: str = None,
    old_string: str = None,
    new_string: str = None,
    replace_all: bool = False,
) -> str:
    """Manage user-created skills. Dispatches to the appropriate action handler.

    Returns JSON string with results.
    """
    if action == "create":
        if not content:
            return json.dumps({"success": False, "error": "content is required for create. Provide the full SKILL.md text (frontmatter + body)."}, ensure_ascii=False)
        result = _create_skill(name, content, category)
    elif action == "edit":
        if not content:
            return json.dumps({"success": False, "error": "content is required for edit. Provide the full updated SKILL.md text."}, ensure_ascii=False)
        result = _edit_skill(name, content)
    elif action == "patch":
        if not old_string:
            return json.dumps({"success": False, "error": "old_string is required for patch. Provide the text to find."}, ensure_ascii=False)
        if new_string is None:
            return json.dumps({"success": False, "error": "new_string is required for patch. Use empty string to delete matched text."}, ensure_ascii=False)
        result = _patch_skill(name, old_string, new_string, file_path, replace_all)
    elif action == "delete":
        result = _delete_skill(name)
    elif action == "write_file":
        if not file_path:
            return json.dumps({"success": False, "error": "file_path is required for write_file. Example: references/api-guide.md"}, ensure_ascii=False)
        if file_content is None:
            return json.dumps({"success": False, "error": "file_content is required for write_file."}, ensure_ascii=False)
        result = _write_file(name, file_path, file_content)
    elif action == "remove_file":
        if not file_path:
            return json.dumps({"success": False, "error": "file_path is required for remove_file."}, ensure_ascii=False)
        result = _remove_file(name, file_path)
    else:
        result = {"success": False, "error": f"Unknown action '{action}'. Use: create, edit, patch, delete, write_file, remove_file"}
    if result.get("success"):
        try:
            from agent.prompt_builder import clear_skills_system_prompt_cache
            clear_skills_system_prompt_cache(clear_snapshot=True)
        except Exception:
            pass
    return json.dumps(result, ensure_ascii=False)
```

## The memory-reinforcement flywheel: Memory Nudge → Memory Review → memory_tool (add/replace/remove)

MEMORY_REVIEW_PROMPT guides the LLM to a judgment, and the rest of the flow follows from it.

```python
_MEMORY_REVIEW_PROMPT = (
    "Review the conversation above and consider saving to memory if appropriate.\n\n"
    "Focus on:\n"
    "1. Has the user revealed things about themselves — their persona, desires, preferences, or personal details worth remembering?\n"
    "2. Has the user expressed expectations about how you should behave, their work style, or ways they want you to operate?\n\n"
    "If something stands out, save it using the memory tool. "
    "If nothing is worth saving, just say 'Nothing to save.' and stop."
)
```
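Given that prompt, the review agent's saves come out as memory_tool calls along these lines. The content values are invented for illustration, and `shared_memory_store` stands in for the MemoryStore that Hermes binds when registering the tool; the full memory_tool signature appears later in this section:

```python
# Hypothetical review-agent saves; the content values are invented.
# `shared_memory_store` stands in for the MemoryStore bound by the agent.
memory_tool(
    action="add",
    target="user",
    content="Prefers concise answers with runnable code samples.",
    store=shared_memory_store,
)
memory_tool(
    action="replace",
    target="memory",
    old_text="Project runs on Python 3.9",
    content="Project upgraded to Python 3.12",
    store=shared_memory_store,
)
```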
The prompt walks the review agent through:

- Evaluating user information: identify the user's personal traits, desires, and preferences
- Checking behavioral expectations: detect expectations about how the agent should behave and the user's working style
- Deciding on a branch:
  - user preference information → save to USER.md
  - important environment information → save to MEMORY.md
  - existing information needs updating → use the replace action
  - nothing worth saving → reply "Nothing to save."

Once the LLM reaches a judgment, it calls memory_tool as the prompt instructs. Under the prompt's guidance the LLM:

- analyzes the personal information the user revealed in the conversation
- checks the behavioral expectations the user expressed
- evaluates how important and durable the information is
- decides the save target and the operation type

### memory_tool action dispatch

The memory_tool function dispatches to the matching operation based on the LLM's decision:

- `action="add"` → add a new entry
- `action="replace"` → update an existing entry
- `action="remove"` → delete an entry

Key points:

- The entire judgment is made by the AI model under the guidance of `_MEMORY_REVIEW_PROMPT`; there are no hardcoded rules
- The LLM's judgment directly determines the memory operation type and target
- The Memory Nudge fires every 10 conversation turns by default
- Memory content is injected into the system prompt at session start, providing persistent context
- Character limits keep memory focused: MEMORY.md at 2,200 characters, USER.md at 1,375

The initialization that wires up these intervals:

```python
self._memory_nudge_interval = 10
self._memory_flush_min_turns = 6
self._turns_since_memory = 0
self._iters_since_skill = 0
if not skip_memory:
    try:
        mem_config = _agent_cfg.get("memory", {})
        self._memory_enabled = mem_config.get("memory_enabled", False)
        self._user_profile_enabled = mem_config.get("user_profile_enabled", False)
        self._memory_nudge_interval = int(mem_config.get("nudge_interval", 10))
        self._memory_flush_min_turns = int(mem_config.get("flush_min_turns", 6))
    except Exception:
        pass
```

And the tool's entry point:

```python
def memory_tool(
    action: str,
    target: str = "memory",
    content: str = None,
    old_text: str = None,
    store: Optional[MemoryStore] = None,
) -> str:
    """Single entry point for the memory tool. Dispatches to MemoryStore methods.

    Returns JSON string with results.
    """
    if store is None:
        return json.dumps({"success": False, "error": "Memory is not available. It may be disabled in config or this environment."}, ensure_ascii=False)
    if target not in ("memory", "user"):
        return json.dumps({"success": False, "error": f"Invalid target '{target}'. Use 'memory' or 'user'."}, ensure_ascii=False)
    if action == "add":
        if not content:
            return json.dumps({"success": False, "error": "Content is required for add action."}, ensure_ascii=False)
        result = store.add(target, content)
    elif action == "replace":
        if not old_text:
            return json.dumps({"success": False, "error": "old_text is required for replace action."}, ensure_ascii=False)
        if not content:
            return json.dumps({"success": False, "error": "content is required for replace action."}, ensure_ascii=False)
        result = store.replace(target, old_text, content)
    elif action == "remove":
        if not old_text:
            return json.dumps({"success": False, "error": "old_text is required for remove action."}, ensure_ascii=False)
        result = store.remove(target, old_text)
    else:
        return json.dumps({"success": False, "error": f"Unknown action '{action}'. Use: add, replace, remove"}, ensure_ascii=False)
    return json.dumps(result, ensure_ascii=False)
```
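The key points above say memory is injected into the system prompt when a session starts; in Hermes that happens inside PromptBuilder. As a rough sketch of the idea, assuming only the MEMORY.md/USER.md file names and the character budgets from the article, it might look like this:

```python
from pathlib import Path

# Rough sketch of memory re-injection (assumed implementation; only the
# file names and character budgets come from the article).
MEMORY_LIMIT, USER_LIMIT = 2200, 1375

def build_system_prompt(base_prompt: str, memory_dir: Path) -> str:
    sections = [base_prompt]
    for fname, limit, title in (
        ("MEMORY.md", MEMORY_LIMIT, "Environment facts"),
        ("USER.md", USER_LIMIT, "User profile"),
    ):
        path = memory_dir / fname
        if path.exists():
            # Enforce the character budget so memory stays focused.
            text = path.read_text(encoding="utf-8")[:limit]
            sections.append(f"## {title}\n{text}")
    return "\n\n".join(sections)
```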
## The 5 ideas in the Hermes source most worth borrowing

Having read the source, what I find most worth borrowing is that Hermes turns the closed learning loop described above into a self-improving flywheel that actually runs: trigger a review → distill experience → write it back in layers → re-inject it on the next turn → execute again. Five principles stand out:

1. **Define the trigger conditions before talking about learning ability.** Hermes does not vaguely claim "the system self-evolves"; it first uses the Memory Nudge and Skill Nudge to decide when to enter review. One fires on conversation turns, the other on tool iterations. No trigger, no loop.
2. **Review is targeted distillation, not a generic summary.** The memory review looks at preferences, behavioral expectations, and long-lived facts; the skill review looks at non-trivial tasks, trial-and-error, and reusable methods. It is not "think it over again" but a structured retrospective with explicit extraction targets.
3. **Distilled experience must be written back in layers.** User profile and environment facts go into MEMORY.md / USER.md; reusable methods become skills; the bulk of the raw history stays in the session, recoverable via search. Only with layering does the system avoid getting dirtier with use.
4. **Defer the learning path; never block the main task.** Hermes runs the review agent asynchronously in the background via `_spawn_background_review`. The main task finishes first, learning executes afterwards, and the written-back results only influence the next turn. The flywheel spins without slowing down the primary answer.
5. **After writing back, the next turn must actually see it.** This is why Hermes does not stop at write-back: it also keeps a frozen snapshot, a skills index, and session search. Written-back memory must re-enter the prompt, skills must be loadable on demand, and past processes must be findable. Without re-injection, it is not a closed learning loop.

## DIY: how to bootstrap a closed learning loop quickly

If you want to borrow Hermes's approach for your own agent, start by getting a minimal loop running (a skeleton sketch follows after this list):

1. **Build a session store first.** Persist messages, tool calls, errors, and their corrections reliably. Without this layer, the later review and search steps have no raw material at all.
2. **Split long-lived assets into two kinds: memory and skills.** Memory holds preferences and facts; skills hold reusable methods. Separate these layers early, or the system degrades quickly.
3. **Add a minimal trigger.** It does not need to be as complete as Hermes's from day one, but you need at least one or two explicit hooks that decide when to run a retrospective. The simplest approach is to trigger on conversation turns or tool-call counts.
4. **Use a background reviewer for structured retrospectives.** It does not execute the main task; it only decides three things: whether to add memory, whether to create a skill, and whether to update an old skill. This decouples the main path from the learning path.
5. **Finish with re-injection.** Written-back memory must reappear in the next turn's prompt, skills must rejoin execution via an index or on-demand loading, and history should be searchable. Only then is the loop actually closed.
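Here is one way those five steps might hang together in code. Everything below is an illustrative skeleton under my own naming (MiniLoopAgent, the prompt text, the plain-text memory file), not Hermes's API:

```python
import threading

# Illustrative skeleton of a minimal closed learning loop; all names are
# invented for this sketch and do not come from the Hermes codebase.
class MiniLoopAgent:
    def __init__(self, llm, memory_file="MEMORY.md", nudge_interval=10):
        self.llm = llm                    # any callable: list[dict] -> str
        self.memory_file = memory_file
        self.nudge_interval = nudge_interval
        self.session = []                 # step 1: session store (in-memory here)
        self.turns_since_review = 0

    def chat(self, user_message: str) -> str:
        # Step 5: re-injection — memory re-enters the prompt every turn.
        system = "You are a helpful agent.\n\n# Memory\n" + self._read_memory()
        self.session.append({"role": "user", "content": user_message})
        reply = self.llm([{"role": "system", "content": system}, *self.session])
        self.session.append({"role": "assistant", "content": reply})

        # Step 3: minimal trigger — a turn-count nudge.
        self.turns_since_review += 1
        if self.turns_since_review >= self.nudge_interval:
            self.turns_since_review = 0
            snapshot = list(self.session)
            # Step 4: background reviewer — never blocks the main answer.
            threading.Thread(target=self._review, args=(snapshot,), daemon=True).start()
        return reply

    def _review(self, snapshot):
        # Step 2: structured retrospective — ask only for durable facts.
        prompt = ("Review the conversation and list durable user preferences or "
                  "facts worth remembering, one per line. If none, say NOTHING.")
        notes = self.llm([*snapshot, {"role": "user", "content": prompt}])
        if "NOTHING" not in notes:
            with open(self.memory_file, "a", encoding="utf-8") as f:
                f.write(notes.strip() + "\n")  # write-back to the memory layer

    def _read_memory(self) -> str:
        try:
            with open(self.memory_file, encoding="utf-8") as f:
                return f.read()
        except FileNotFoundError:
            return "(empty)"
```

Even at this size, all five pieces are present: a session store, a memory layer, a trigger, a background reviewer, and re-injection. Everything Hermes adds on top (skills, indexes, snapshots, search) is refinement of the same loop.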