How to Build Agents That Learn From Every Run

Summary

The article by Thomas Krier explores how to build sustainable, self-improving AI agent systems. The core thesis is that the patterns which survive thousands of agent runs aren’t clever one-off prompts—they’re structured memory systems that accumulate learning over time.

Key Concepts

  • AGENTS.md as persistent memory — Rather than relying solely on system prompts, AGENTS.md files serve as project-specific memory that persists across sessions. These can be hierarchically organized (root → component → tool level) so general rules flow down while specialized patterns stay local; a loader sketch follows this list.
  • Meta-prompting for rule evolution — The real power comes from a feedback loop: after failures, instruct the agent to update its rules so that class of mistake can’t recur; after successes, have it propose improvements. This lets the agent develop rules humans wouldn’t have thought of; a failure-to-rule sketch follows this list.
  • Self-observing agents — The author had agents analyze their own session traces (token usage, tool calls, runtimes) and create scripts and rules so other agents could do the same, building the infrastructure needed for data-driven orchestration; a trace-analysis sketch follows this list.
  • Context management — Context quality degrades as a session grows long. Rather than relying on automatic compression, use explicit handoffs: have the agent summarize, verify the summary, then pass it to a fresh session; a handoff sketch follows this list.
  • Multi-agent orchestration — Running agents in parallel creates an “arena” where you pick the best result, but cooperative refinement is more powerful: agents review each other’s solutions and improve their own. A refinement sketch follows this list.
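
To make the hierarchy concrete, here is a minimal sketch of a loader that walks from the agent’s working directory up to the repository root and concatenates every AGENTS.md it finds, so root-level rules always apply and more deeply nested files add local specializations. The function name and layout are illustrative assumptions, not the author’s tooling.

```python
# Minimal sketch, assuming a repo where AGENTS.md files may sit at any level.
from pathlib import Path

def collect_agents_md(work_dir: Path, repo_root: Path) -> str:
    """Concatenate AGENTS.md files from repo_root down to work_dir, root first."""
    work_dir, repo_root = work_dir.resolve(), repo_root.resolve()
    chain = []
    current = work_dir
    while True:
        candidate = current / "AGENTS.md"
        if candidate.is_file():
            chain.append(candidate.read_text())
        if current == repo_root or current == current.parent:
            break
        current = current.parent
    # Reverse so general (root) rules come first and the most local rules last.
    return "\n\n".join(reversed(chain))
```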
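
A failure-to-rule pipeline can be as small as the sketch below: feed the error and the current rules back to the model, ask for one preventive rule, and append it only after human approval. The `llm` callable and the prompt wording are assumptions, standing in for whatever model client you already use.

```python
from pathlib import Path

RULE_PROMPT = """The last run failed with this error:

{error}

Current rules from AGENTS.md:

{rules}

Propose ONE new rule that would prevent this class of failure.
Reply with the rule text only."""

def record_failure_as_rule(error: str, agents_md: Path, llm) -> None:
    """Ask the model for a preventive rule and append it after human approval."""
    rules = agents_md.read_text() if agents_md.exists() else ""
    proposed = llm(RULE_PROMPT.format(error=error, rules=rules)).strip()
    # Keep a human in the loop before the rule becomes permanent memory.
    if input(f"Add rule?\n  {proposed}\n[y/N] ").lower() == "y":
        with agents_md.open("a") as f:
            f.write(f"\n- {proposed}\n")
```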
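
For self-observation, a small script over a session trace is enough to start. The JSONL field names below (tokens, tool, duration_s) describe an assumed log format, not any particular framework’s schema.

```python
# Sketch: aggregate per-session metrics from a JSONL trace (assumed format).
import json
from collections import Counter
from pathlib import Path

def summarize_trace(path: Path) -> dict:
    total_tokens, total_seconds = 0, 0.0
    tool_calls = Counter()
    for line in path.read_text().splitlines():
        event = json.loads(line)
        total_tokens += event.get("tokens", 0)
        total_seconds += event.get("duration_s", 0.0)
        if "tool" in event:
            tool_calls[event["tool"]] += 1
    return {
        "tokens": total_tokens,
        "runtime_s": round(total_seconds, 1),
        "tool_calls": dict(tool_calls.most_common()),
    }

if __name__ == "__main__":
    print(summarize_trace(Path("session_trace.jsonl")))
```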
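
An explicit handoff can be scripted as summarize, review, then reseed. The session objects and their `ask` method are assumptions about your agent client, not a real API; the point is that a human checks the summary before a fresh session starts from it.

```python
HANDOFF_PROMPT = (
    "Summarize this session for a fresh agent: goals, decisions made, "
    "files touched, open questions, and the next concrete step."
)

def handoff(old_session, new_session_factory):
    """Summarize the long session, review it, then seed a brand-new session."""
    summary = old_session.ask(HANDOFF_PROMPT)
    print("--- proposed handoff summary ---\n" + summary)
    if input("Start a fresh session with this summary? [y/N] ").lower() != "y":
        return old_session  # keep working in the current session
    fresh = new_session_factory()
    fresh.ask("Context from the previous session:\n" + summary)
    return fresh
```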
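
Arena-then-refine orchestration might look like the sketch below: every agent drafts in parallel, then each one reviews a peer’s draft and revises its own. Again, agent objects with an `ask(prompt) -> str` method are an assumed interface rather than a specific SDK.

```python
from concurrent.futures import ThreadPoolExecutor

REVIEW_PROMPT = (
    "Here is another agent's solution to the same task:\n\n{other}\n\n"
    "What does it do better than yours? Revise your solution to incorporate "
    "those improvements and return the full revised version."
)

def cooperative_refinement(task: str, agents) -> list[str]:
    # Round 1: every agent drafts a solution in parallel (the "arena").
    with ThreadPoolExecutor() as pool:
        drafts = list(pool.map(lambda a: a.ask(task), agents))
    # Round 2: each agent reviews the next agent's draft and revises its own.
    revised = []
    for i, agent in enumerate(agents):
        peer_draft = drafts[(i + 1) % len(drafts)]
        revised.append(agent.ask(REVIEW_PROMPT.format(other=peer_draft)))
    return revised
```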

Usable Takeaways

  • Create an AGENTS.md file in your projects — Store learnings, patterns to follow, and mistakes to avoid. Place it at multiple directory levels for hierarchical rules.
  • Build a failure-to-rule pipeline — When an agent makes a mistake, don’t just fix it. Add a rule: “After errors, update AGENTS.md to prevent this class of failure.”
  • Add a post-session reflection prompt — After successful work, ask: “Propose 5 improvements to the rules based on this session. Wait for approval before updating.”
  • Ban mocks and stubs in testing instructions — Explicitly tell agents “no fake, no mock, no stub” because they’ll happily fake passing tests to complete tasks.
  • Use manual handoffs instead of auto-compaction — When context gets long, have the agent summarize explicitly, review it yourself, then start a fresh session with that summary.
  • Run agents in parallel, then do cooperative refinement — Have multiple agents build solutions, then show them each other’s work with the prompt: “What’s better than yours? What could improve your approach?”
  • Mine community AGENTS.md files — Pull examples from GitHub repositories into a vector database so agents can retrieve relevant patterns when starting new projects; a retrieval sketch follows this list.
  • Give agents self-observation tools — Have agents create scripts to analyze their own session data (tokens, tool calls, runtime) so you can make data-driven decisions about orchestration.
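
A minimal retrieval setup for mined AGENTS.md files could look like the sketch below: embed each file, keep the vectors in memory, and rank by cosine similarity. The `embed` callable stands in for whichever embedding model you use, and a production setup would likely replace the in-memory list with a real vector database.

```python
import numpy as np
from pathlib import Path

class AgentsMdIndex:
    """Tiny in-memory index over AGENTS.md files pulled from community repos."""

    def __init__(self, embed):
        self.embed = embed  # callable: str -> 1-D numpy array (assumption)
        self.docs: list[tuple[str, np.ndarray]] = []

    def add_repo(self, repo_dir: Path) -> None:
        for path in repo_dir.rglob("AGENTS.md"):
            text = path.read_text()
            self.docs.append((text, self.embed(text)))

    def search(self, query: str, k: int = 3) -> list[str]:
        q = self.embed(query)
        scored = sorted(
            self.docs,
            key=lambda doc: float(
                np.dot(q, doc[1]) / (np.linalg.norm(q) * np.linalg.norm(doc[1]))
            ),
            reverse=True,
        )
        return [text for text, _ in scored[:k]]
```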