Agentic Harness Engineering

Paul Iusztin

Mar 31

Building systems that transform the LLM into the new operating system

12 Comments

ToxSec

Apr 1

really timely article. it’s super interesting to see Anthropic’s approach to harnessing with this recent leak.

looks like a lot of people suddenly became more aware of these techniques. i think we are going to see more attention in this area, and articles like this are super useful.

Paul Iusztin

Apr 2

thanks man! More similar articles incoming

richardstevenhack

Mar 31

Re sandboxes. Do remember that AIs aren't bad at escaping sandboxes. They've done it before.

And since agents are inherently unreliable, deterministic procedures must be in place to control and monitor them as part of - perhaps the major part of - the harness engineering.
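A minimal sketch of what one such deterministic control could look like, assuming the agent proposes shell commands as plain strings. The names `CommandGate`, `ALLOWED_COMMANDS`, and `audit_log` are hypothetical and do not come from the article; a real harness would need far more than this.

```python
import shlex
from dataclasses import dataclass, field

# Hypothetical allowlist: the only executables the agent may invoke.
ALLOWED_COMMANDS = {"ls", "cat", "grep"}

# Characters that enable chaining, piping, redirection, or substitution.
SHELL_METACHARACTERS = ";|&`$><\n"


@dataclass
class CommandGate:
    """Deterministic check applied to every command before it is run."""

    allowed: set = field(default_factory=lambda: set(ALLOWED_COMMANDS))
    audit_log: list = field(default_factory=list)

    def check(self, command_line: str) -> bool:
        try:
            tokens = shlex.split(command_line)
        except ValueError:
            # Unbalanced quotes and similar parse errors are rejected.
            tokens = []
        permitted = (
            bool(tokens)
            and tokens[0] in self.allowed
            and not any(ch in command_line for ch in SHELL_METACHARACTERS)
        )
        # Monitoring: record every decision, allowed or denied.
        self.audit_log.append((command_line, permitted))
        return permitted


gate = CommandGate()
for proposal in ["ls -la", "rm -rf /", "cat notes.txt | sh"]:
    print(proposal, "->", "allow" if gate.check(proposal) else "deny")
```

The point is that the decision does not depend on the model: the same input always gets the same verdict, and every verdict lands in the log. An allowlist like this limits what the agent can ask for; it does not make the underlying sandbox any harder to escape.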

Paul Iusztin

Apr 2

yes, exactly! Also, sandboxes come in different levels. If you use a VM as the sandbox, it's impossible to escape; if the sandbox is just a Python process, well...

But the best way to engineer this is still an open question.

richardstevenhack

Apr 2

VMs can be escaped.

AI Agents Escaping Containers: What the Latest Research Means For Businesses

https://www.purpleshieldsecurity.com/post/ai-agents-container-breakout-risks

Never underestimate what a determined and creative AI agent can do. They will take multiple steps to reach their goal, and you cannot predict those steps precisely because they are probabilistic.

People also need to remember that an agent running on a machine can see and potentially affect and use everything on that machine, whether that machine is the host or a VM.

It's the classic basic computer security line: If an attacker has access to your machine, it's no longer your machine.
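As a small illustration of the "can see everything on that machine" point, here is a sketch that narrows one kind of exposure: environment variables inherited by a tool process. It assumes the harness launches tools as child processes; `SAFE_ENV_KEYS`, `scrubbed_env`, and `run_tool` are hypothetical names. It does nothing about files, network access, or anything else on the machine.

```python
import os
import subprocess

# Hypothetical allowlist of environment variables a tool may inherit.
SAFE_ENV_KEYS = ("PATH", "LANG", "HOME")


def scrubbed_env(source=None, keep=SAFE_ENV_KEYS):
    """Return a copy of the environment containing only allowlisted keys."""
    source = os.environ if source is None else source
    return {key: source[key] for key in keep if key in source}


def run_tool(argv, timeout_s=10):
    """Run a tool without passing along tokens or keys held in the parent env."""
    return subprocess.run(
        argv,
        env=scrubbed_env(),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=False,
    )


# Variables outside the allowlist are dropped before the child starts.
print(scrubbed_env({"PATH": "/usr/bin", "API_TOKEN": "secret"}))
```

This only removes one easy source of credentials. An agent on the same machine can still read whatever the process user can read, which is why the comment above argues for a separate machine.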

Paul Iusztin

Apr 3

I didn't know that, man! But now that you've highlighted it, it makes a lot of sense. Ultimately, it's our job to put the right guardrails in place to control this behavior.

Do you trust your agents after reading this?

richardstevenhack

Apr 3

As it happens, at this juncture I don't have any agents. I stayed away from OpenClaw as soon as I heard what a fiasco that was.

Eventually I'll use agents, but only on a separate machine from my main machines (or a VPS) after clarifying the exact ways to lock them down. There are plenty of instructional videos on that these days.

On the corporate level, the issue is much harder.