A practical approach to working with AI agents: clear workflows, agent definition files, and step-by-step reviews
The model writes code using functions or libraries that do not exist. It sounds confident, but the result does not work.
The same question gives a different answer in each session. You cannot rely on getting the same code structure or naming.
After many messages, the model starts to ignore your original rules — the coding style, the approved libraries, the project setup.
Every new session starts from zero. The model has no idea what you decided before — past choices, rejected ideas, team rules.
Session start
You give the agent its rules: which libraries to use, how to name things, what to avoid. It follows them well at first.
Mid session
As the chat grows, the model's attention spreads across all the messages. The early rules start to get less weight.
End of session
The model breaks your naming rules, adds libraries you did not ask for, and forgets the decisions you made two hours ago.
❶ No Long-Term Memory
There is no built-in way to carry decisions over from one session to the next. You have to re-explain everything from the start.
❷ Context Window Problem
A single long system prompt tries to cover everything — and covers nothing well. The model gets confused about its own role.
❸ No Review Step
Without a review step, wrong code gets merged. Problems are found late — or not at all.
① Existing Agent (Trail of Bits)
github.com/trailofbits/skills
└─ plugins/differential-review
✓ Published security methodology
✓ No setup needed — just reference it
✓ Differential security review pattern
✓ Trusted by security researchers
no custom agent file needed
② Prompt Given
"Use the security audit approach from github.com/trailofbits/skills
…/differential-review to review this Scala push notification API. Read all source files. Find vulnerabilities. For each issue, show the current code, the fixed code, and steps to verify the fix. Group by severity. Mark anything that needs a human to fix manually."
→ agent reads codebase autonomously
③ Output — SECURITY_FIXES_EN.md
Before — Everything in one request
After — Queue-based, separate worker
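The deck does not show the actual fix from SECURITY_FIXES_EN.md, so the following is only a minimal sketch of the general "queue-based, separate worker" shape, written in Python for brevity rather than in the project's Scala. All names here (`send_push`, `handle_request`, the device tokens) are hypothetical. The request handler only enqueues the work and returns, and a separate worker thread does the slow sending.

```python
import queue
import threading

# Hypothetical stand-in for the slow call to a push provider.
def send_push(device_token: str, message: str) -> str:
    return f"sent to {device_token}: {message}"

jobs: queue.Queue = queue.Queue()
delivered: list = []

def worker() -> None:
    # Process queued jobs until the None sentinel arrives.
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        device_token, message = job
        delivered.append(send_push(device_token, message))
        jobs.task_done()

def handle_request(device_tokens: list, message: str) -> dict:
    # "After" shape: the handler only enqueues and returns right away.
    for token in device_tokens:
        jobs.put((token, message))
    return {"status": "accepted", "queued": len(device_tokens)}

worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

response = handle_request(["device-a", "device-b"], "hello")
jobs.put(None)   # ask the worker to stop once the queued jobs are done
jobs.join()      # wait until every queued item has been processed
worker_thread.join()
```

In the "before" shape, `handle_request` would call `send_push` for every device itself, so one slow or failing send would hold up the whole request.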
A separate agent for each job — architecture, coding, security — works much better than trying to do everything in one chat session.
A short definition file gives each agent a clear role and rules. It carries over between sessions so you do not have to start from scratch.
The agent presents the options and waits for a decision. It does not pick technologies or make design decisions on its own; that is the developer's job.
When the task order is clear, agents can work on separate pieces at once. This cuts down the total time without losing quality.
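One way to picture "clear task order, separate pieces at once" is a dependency graph that is run in waves. The sketch below assumes a hypothetical task graph and a placeholder `run_agent` function; it is not the tooling used in the deck. It uses Python's standard `graphlib.TopologicalSorter` to find the tasks that are ready, and a thread pool to run each ready group in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the tasks it depends on.
dependencies = {
    "architecture": set(),
    "api_code": {"architecture"},
    "worker_code": {"architecture"},
    "security_review": {"api_code", "worker_code"},
}

def run_agent(task: str) -> str:
    # Placeholder for handing the task to the matching agent.
    return f"{task}: done"

def run_in_waves(deps: dict):
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    results = {}
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            # Every task whose dependencies are finished can run now.
            ready = sorted(sorter.get_ready())
            for task, outcome in zip(ready, pool.map(run_agent, ready)):
                results[task] = outcome
            waves.append(ready)
            sorter.done(*ready)
    return waves, results

waves, results = run_in_waves(dependencies)
```

With this graph, the two coding tasks land in the same wave and run side by side, while the security review waits until both are finished.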