OpenClaw setup
Dedicated Mac, controlled tools, its own logins, narrower worker agents.
A short talk about moving from one-off prompting to systems that can carry context, route work, and support human judgment in real environments.
I’m Matt Yilmaz, a UT Government senior. This class pushed me from one-off prompting toward workflows that could actually carry work.
I was not trying to build a sci-fi agent. I was building an operating layer.
Reference clip: Videos like this made it feel sci-fi. Functionally, this was the direction.
One operating layer, multiple workflows.
Dedicated machine, bounded tools, worker agents, and enough memory to carry work across sessions.
Persistent context, background work, and the operating layer behind the CRM and research systems in the rest of the talk.
That difference mattered for the rest of this talk. Claude Code was excellent for the direct coding loop we learned in class. OpenClaw mattered more to me once the agent needed to live across channels, tools, memory, and background work.
One is great when the work stays inside a coding session. The other becomes useful when the agent has to live outside one window.
Claude Code: Great for writing, editing, debugging, and shipping inside a focused coding loop.
More polished out of the box, which made it ideal for classroom use and direct coding work.
OpenClaw: The same agent could help in private chat, show up in group threads, and stay available outside a coding window.
That made it much better for multi-step automation and the system-building work behind the examples in this talk.
The old workflow was Zendesk plus a lot of manual reconstruction. Tickets were organized, but the actual intelligence still lived outside the workflow.
Volume made the small gaps hurt.
Once the order count is real, every manual step in support and escalation starts compounding.
The trick was not one big agent. It was a small operating layer with clear roles and clean handoffs.
Route: Sort first and escalate cleanly.
Assemble: Pull the useful fragments together.
Draft: Work from live context and policy memory.
Measure: Track the flow so it improves.
Route, assemble, draft, and measure inside one workflow.
The biggest gain came from tightening the workflow: route first, assemble context, draft in the thread, and make the system measurable.
Sort the queue, surface urgency, and send tickets where they belong before asking the model to write anything.
Generation is most useful after context is assembled, not as a substitute for it.
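To make the route, assemble, draft, measure order concrete, here is a minimal Python sketch. It is not the system from the talk: the `Ticket` fields, the keyword rules in `ROUTES` and `URGENT_WORDS`, and the `order_notes` and `policies` inputs are all hypothetical, and the draft step is a plain template standing in for a model call. The point it illustrates is the ordering: nothing gets written until the ticket is routed and its context is assembled.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Ticket:
    ticket_id: str
    subject: str
    body: str
    queue: str = "unsorted"
    urgent: bool = False
    context: list = field(default_factory=list)
    draft: str = ""


# Hypothetical keyword rules, for illustration only.
ROUTES = {"refund": "billing", "tracking": "shipping", "broken": "returns"}
URGENT_WORDS = ("chargeback", "urgent")


def route(ticket):
    """Step 1: pick a queue and flag urgency before any text is generated."""
    text = f"{ticket.subject} {ticket.body}".lower()
    ticket.queue = next((q for word, q in ROUTES.items() if word in text), "general")
    ticket.urgent = any(word in text for word in URGENT_WORDS)
    return ticket


def assemble(ticket, order_notes, policies):
    """Step 2: gather order notes and the policy for the ticket's queue."""
    ticket.context = list(order_notes.get(ticket.ticket_id, []))
    if ticket.queue in policies:
        ticket.context.append(policies[ticket.queue])
    return ticket


def draft(ticket):
    """Step 3: stand-in for a model call; refuses to draft without context."""
    if not ticket.context:
        ticket.draft = "NEEDS HUMAN: no context found"
    else:
        ticket.draft = "Draft reply based on: " + "; ".join(ticket.context)
    return ticket


def measure(tickets):
    """Step 4: count queues, urgent tickets, and drafts handed to a human."""
    stats = Counter()
    for t in tickets:
        stats[f"queue:{t.queue}"] += 1
        stats["urgent"] += int(t.urgent)
        stats["needs_human"] += int(t.draft.startswith("NEEDS HUMAN"))
    return stats


def run(tickets, order_notes, policies):
    """Run every ticket through route -> assemble -> draft, then measure."""
    done = [draft(assemble(route(t), order_notes, policies)) for t in tickets]
    return done, measure(done)
```

A ticket with no assembled context is handed to a person instead of receiving a generic draft. That is one simple way to keep generation from substituting for context.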
VA triage, escalation context, AI reply.
Not a prettier inbox. A workflow that could route, surface context, and draft inside the thread.
The default TLO search was fine for lookup, but weak for real research. 89search made it easier to search by topic, filter by member status, and stay close to the source material.
Structured search beats blind keyword hunting.
Topic filters, member context, and visible sources made research faster than basic TLO search.
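As a rough illustration of what structured search means here, the sketch below filters records by topic and member status before any keyword match, and keeps the source link on every result. It does not describe how 89search is actually built: the `Record` fields, the status values, and the example.org URLs are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    title: str
    topics: frozenset      # assumed to hold lowercase topic tags
    member: str
    member_status: str     # hypothetical values: "current" or "former"
    source_url: str        # kept on every result so the source stays visible


def structured_search(records, topic: Optional[str] = None,
                      member_status: Optional[str] = None,
                      keyword: Optional[str] = None):
    """Apply the structured filters first, then an optional title keyword."""
    matches = []
    for record in records:
        if topic is not None and topic.lower() not in record.topics:
            continue
        if member_status is not None and record.member_status != member_status:
            continue
        if keyword is not None and keyword.lower() not in record.title.lower():
            continue
        matches.append(record)
    return matches


# Made-up sample data for illustration.
RECORDS = [
    Record("Water infrastructure funding", frozenset({"water", "budget"}),
           "Member A", "current", "https://example.org/item/1"),
    Record("Rural water districts", frozenset({"water"}),
           "Member B", "former", "https://example.org/item/2"),
    Record("School finance update", frozenset({"education", "budget"}),
           "Member A", "current", "https://example.org/item/3"),
]

current_water = structured_search(RECORDS, topic="water", member_status="current")
```

A keyword search for "water" alone returns both water records, including the one from a former member. The topic and status filters narrow it to the single record a researcher likely wanted.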
The useful version of this story is not model magic. It is system design under real constraints.
A narrow system with a real job usually beats a general system that claims it can do everything.
A lot of failures come from bad retrieval, weak constraints, or the wrong tool boundary, not from weak prompting style.
Approval, escalation, and ambiguous edge cases should stay legible and easy to interrupt.
The highest-value systems do not just write text faster. They reduce handoff loss, tab hopping, and repeated explanation.
Useful systems get designed, not merely prompted.
The win came from boundaries, context, review, and feedback loops, not from pretending the model was magic.
Thanks for your time.
Scan to keep up with the work.