
•
How to make your agents prove your product works
Unit tests tell me whether a piece of code behaves. Behavioral agent tests tell me whether a real workflow still makes sense when a person has to use it.

•
How to give agents design taste
I’m testing Pailflow waitlist pages with agents and learning that the hard part is not generating pages. It is giving the agent taste.

•
The Building Blocks of AI Agent Systems
I simplified my stack into clear building blocks so commands, skills, subagents, question gates, and review inboxes work as one system instead of disconnected prompts.

•
How to eval agent skills
I stopped manually spot-checking skill edits and moved to a fixed eval loop. The result was faster revisions, clearer failures, and more trustworthy agent behavior.
