Spinning up an AI agent today is easy. Between no-code tools, vibe-coded scripts, and custom agent frameworks, I can get something working in a few hours that more or less does what I want it to do.
But I’ve learned that getting an agent to do something once is not the same as getting it to do it well — and definitely not the same as getting it to do it reliably. That’s the difference between a toy and a tool.
Over the past few months, I’ve been on a hunt to improve how my agents perform. What inputs work best? How do I reduce ambiguity? How do I make outputs consistent?
One of the biggest unlocks has been structure — specifically, using XML to format the input. It forces clarity, reduces surprises, and makes it way easier to scale agents into real workflows.
In this post, I’ll share what I’ve learned, how I use XML in my own stack, and why this approach has made a real difference.
1. Structure forces clarity
When I write instructions for an agent, it’s easy to fall into the habit of writing for a person. That usually means extra words, indirect phrasing, and unnecessary context. It works fine when you're talking to a human. But AI agents don’t need style or emphasis. They need precision.
XML solves this by forcing me to write like code. I define what I want, and nothing else. That constraint makes the instructions more concise, more explicit, and more useful. I can’t ramble in XML. I have to think clearly about what the agent needs and only include that.
It’s less like giving advice and more like calling a function.
This shift in mindset, from narrative to structured input, has not only cut my input tokens but also freed up budget for context that actually matters to the task. It works because it forces me to be clear.
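To make the shift concrete, here is a hypothetical before-and-after. The prose version is how I used to write instructions; the XML version carries the same intent and nothing else (the task and tags are invented for illustration):

Before: "Please take a look at the pull request when you get a chance, leave some helpful comments on anything that seems off, and it would be great if you could also check whether the tests still make sense."

After:

<role>Code reviewer</role>
<goal>Flag defects in the pull request</goal>
<instructions>
  <step id="1">Check the diff for correctness issues</step>
  <step id="2">Verify the tests cover the changed code</step>
  <step id="3">Return one comment per finding</step>
</instructions>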
2. I use XML to structure model instructions
I still use natural language for user input. But when it comes to how the model behaves, I’ve started structuring those instructions using XML.
This lets me write instructions that are readable by the model, easy for me to update, and scoped clearly to what I want the agent to do. Instead of a vague prompt like “You’re an AI agent that helps with dev workflows,” I now describe the role, goals, routing rules, expected steps, context files, and response formats — all in structured XML.
Here’s a real example from one of the PailSwarm agents:
<role>Development workflow orchestrator that transforms feature requests into actionable development tasks</role>
<goal>Generate PRDs on demand and create GitHub pull requests with intelligent repository selection and context loading</goal>
<instructions>
  <step id="1">Identify request type using intent-based routing rules</step>
  <step id="2">Load relevant context using LoadContextTool.run(agent="PailCode", context_type="[context_file]")</step>
  <step id="3">Execute workflow and return results to PailAssist</step>
</instructions>
<routingRules>
  <rule intent="create pull request (no issue/backlog mentioned)" context="on_demand_prd" agent="PailCode"/>
  <rule intent="create pull request from issue or backlog" context="issue_pull_prd" agent="PailCode"/>
</routingRules>
This is a much more explicit instruction set than a traditional system prompt. It forces me to clarify what the agent is for, how it should act, and how it should handle specific situations.
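The example above stops at routing rules, but context files and response formats fit the same pattern. Here is a hypothetical sketch of how they could be expressed in this style (the tag names and values are invented for illustration, not my production config):

<contextFiles>
  <file purpose="repository conventions">repo_overview.md</file>
</contextFiles>
<responseFormat>
  <field name="summary">One-sentence description of what was done</field>
  <field name="pr_url">Link to the created pull request</field>
</responseFormat>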
3. XML makes it easier to debug agent behavior
One of the hardest parts of working with large text prompts is debugging. If an agent does something unexpected, it's hard to know why. Was the prompt unclear? Was the example too vague? Did it interpret the instruction differently than I intended?
With XML, I’ve been able to get much more visibility into what’s going wrong.
Because each instruction is broken out — by role, goal, step, routing rule, and more — I can trace issues back to a specific section of the instruction. If the agent returns the wrong output, I can usually match it to a routing rule that didn’t fire, or a step that was too long or unclear.
It also makes experimentation easier. If I want to test a different approach, I don’t have to rewrite a paragraph of prose. I just swap out a single tag or value and run it again.
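For example, testing whether issue-based pull requests should route to a different agent is a one-attribute change (the agent name below is made up for illustration):

<rule intent="create pull request from issue or backlog" context="issue_pull_prd" agent="PailReview"/>

Run it again, compare the outputs, and revert if it got worse.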
Instead of treating the prompt like a magic spell, I get to treat it like a real config file. That shift has made building and maintaining agents much less painful.
Conclusion
If your agents feel unpredictable or fragile, the issue might not be with the model. It might be the input.
XML structures give you a way to be explicit, reduce ambiguity, and build more dependable systems. At PailSwarm, using structured XML inputs helped me scale from simple demos to complex, production-ready agent workflows. And the more structure I added, the better things worked.
If you're trying to move from experiments to real systems, start with the inputs. Define a schema. Add validation. Treat your agents like APIs.
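As a starting point, here is a minimal, illustrative schema for the instruction format shown earlier. It assumes the elements are wrapped in a single <agent> root; this is a sketch, not the schema I run in production:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="agent">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="role" type="xs:string"/>
        <xs:element name="goal" type="xs:string"/>
        <xs:element name="instructions">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="step" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:simpleContent>
                    <xs:extension base="xs:string">
                      <xs:attribute name="id" type="xs:positiveInteger" use="required"/>
                    </xs:extension>
                  </xs:simpleContent>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="routingRules">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="rule" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:attribute name="intent" type="xs:string" use="required"/>
                  <xs:attribute name="context" type="xs:string" use="required"/>
                  <xs:attribute name="agent" type="xs:string" use="required"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

With a schema in place, a standard validator (for example, xmllint --schema agent.xsd agent.xml --noout) can catch a malformed instruction file before it ever reaches the model.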
Want more posts like this? Subscribe to my newsletter: https://thebuildingblocks.substack.com/