The Agent Harness Belongs Outside the Sandbox
Where should the agent harness run? The harness is the loop that drives an agent: it sends context to the large language model (LLM), executes the tool calls the model returns, and feeds the results back. That loop can run inside the sandbox alongside the tools, or outside it on the backend. The choice has real consequences for security, performance, and scalability, especially in multi-user environments, and engineers and product managers building agent tooling should understand the tradeoffs before committing to one.
Inside vs. Outside: Two Architectures
Running the harness inside the sandbox means the loop, the model credentials, and the tools all live in a single container. This setup is straightforward and works well for single-user applications: the harness reads and writes the local filesystem directly, so existing off-the-shelf harnesses work without modification. The simplicity has costs, though. API credentials must be injected into the container, where any code the agent runs can read them. And because the harness process itself lives inside the sandbox, the sandbox can never be suspended while a session is open, even when the agent is idle waiting on the model.
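A minimal sketch makes the coupling concrete. Everything below is hypothetical (the function names and the stand-in model call are illustrative, not any real harness's API), but it shows the defining trait of the in-sandbox model: the credentials, the loop, and the filesystem the tools touch all share one container and one lifecycle.

```python
# Hypothetical sketch of the in-sandbox model: the harness loop runs in the
# same container as the tools it executes. All names are illustrative.
import os
import pathlib
import tempfile

def run_harness_in_sandbox(prompt: str) -> str:
    # The LLM API key must be injected into the sandbox environment,
    # so any code the agent executes can read it too.
    api_key = os.environ.get("LLM_API_KEY", "sk-demo")

    workdir = pathlib.Path(tempfile.mkdtemp())

    # Stand-in for the model call; a real harness would hit the LLM API here.
    def call_model(p: str) -> dict:
        return {"tool": "write_file", "path": "notes.txt", "content": p.upper()}

    action = call_model(prompt)
    # Tool calls operate directly on the local filesystem -- simple,
    # but the harness and the sandbox now share one lifecycle.
    if action["tool"] == "write_file":
        (workdir / action["path"]).write_text(action["content"])
    return (workdir / action["path"]).read_text()
```

The convenience is real: tool execution is just local I/O. The cost is equally visible: `api_key` sits in the same environment as whatever code the agent generates.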
Running the harness outside the sandbox inverts these tradeoffs. Credentials stay on the backend and never enter the environment where agent-authored code executes. Sandboxes can be provisioned on demand and suspended between tool calls, cutting idle resource usage. Because the sandbox holds no irreplaceable state, it becomes an interchangeable component: if one fails, the harness provisions another and the session continues. For organizations with many engineers, this model also simplifies shared access to skills and memories, since that state lives in a backend store rather than behind the complexities of a distributed filesystem.
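The "interchangeable sandbox" idea can be sketched as well. The `Sandbox` class and `provision` callable below are stand-ins for a real container API; the point is that the harness holds the credentials and only ships commands across the boundary, so a dead sandbox can be swapped for a fresh one mid-session.

```python
# Hypothetical sketch of the out-of-sandbox model: the harness keeps the
# credentials and treats sandboxes as replaceable executors behind a small
# interface. All names here are illustrative.
class Sandbox:
    """Stand-in for a remote execution environment (e.g. a container API)."""
    def __init__(self, sandbox_id: str):
        self.sandbox_id = sandbox_id
        self.alive = True

    def exec(self, command: str) -> str:
        if not self.alive:
            raise ConnectionError(f"sandbox {self.sandbox_id} is gone")
        return f"ran {command!r} in {self.sandbox_id}"

def run_step(provision, command: str, sandbox=None):
    """Run one tool call, provisioning a fresh sandbox if the old one died."""
    # Credentials never cross this boundary; only the command does.
    try:
        sandbox = sandbox or provision()
        return sandbox.exec(command), sandbox
    except ConnectionError:
        replacement = provision()  # sandboxes are interchangeable
        return replacement.exec(command), replacement

counter = iter(range(100))
provision = lambda: Sandbox(f"sb-{next(counter)}")

out1, sb = run_step(provision, "pytest -q")
sb.alive = False                               # simulate a sandbox failure
out2, sb2 = run_step(provision, "pytest -q", sandbox=sb)
```

Session recovery falls out of the structure: the failed `sb-0` is replaced by `sb-1` and the same command runs again, with the harness's state untouched.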
Tradeoffs and Real Implications
Running the harness outside the sandbox isn't free. Off-the-shelf harnesses are designed around a local filesystem and require adaptation. Durable execution becomes a priority: agent sessions must survive backend deploys and process failures. Tools like Inngest help here, letting long-running functions checkpoint their progress and resume seamlessly after a disruption.
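The core durable-execution pattern is memoized steps: each step's result is persisted before moving on, so a replay after a crash or deploy skips past completed work instead of redoing it. Below is a stdlib-only sketch of that pattern, in the spirit of what frameworks like Inngest provide; the class and method names are my own, not Inngest's API.

```python
# Stdlib-only sketch of the memoized-step pattern behind durable execution.
# Each step's result is checkpointed, so a restarted session replays past
# completed work instead of re-executing it. Names are illustrative.
import json
import pathlib
import tempfile

class DurableSession:
    def __init__(self, checkpoint_path: pathlib.Path):
        self.path = checkpoint_path
        self.done = json.loads(self.path.read_text()) if self.path.exists() else {}

    def step(self, step_id: str, fn):
        """Run fn once; on replay, return the checkpointed result instead."""
        if step_id in self.done:
            return self.done[step_id]            # memoized: skip re-execution
        result = fn()
        self.done[step_id] = result
        self.path.write_text(json.dumps(self.done))  # persist before moving on
        return result

ckpt = pathlib.Path(tempfile.mkdtemp()) / "session.json"
calls = []

def agent_turn(session: DurableSession) -> str:
    a = session.step("plan", lambda: calls.append("plan") or "made a plan")
    b = session.step("execute", lambda: calls.append("execute") or "ran tools")
    return f"{a}; {b}"

first = agent_turn(DurableSession(ckpt))   # both steps execute
second = agent_turn(DurableSession(ckpt))  # simulated restart: pure replay
```

On the second run, representing the session resuming after a failure, neither step's body executes again: the results come straight from the checkpoint.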
The sandbox lifecycle also has to be managed deliberately. Providers like Blaxel offer fast resume times, so the sandbox only runs while a tool call is actually executing. Keeping resume latency low matters: a slow cold start is paid on every interactive turn.
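The lifecycle policy itself is simple to state: resume on the first tool call, suspend after an idle timeout. Here is an illustrative sketch of that policy; the `ManagedSandbox` API is hypothetical (real providers such as Blaxel expose their own SDKs), and the timeout is shrunk to milliseconds so the example runs quickly.

```python
# Illustrative sketch of on-demand sandbox lifecycle management:
# resume on use, suspend when idle. The API here is hypothetical.
import time

class ManagedSandbox:
    IDLE_TIMEOUT = 0.05  # seconds; real systems use minutes

    def __init__(self):
        self.state = "suspended"
        self.last_used = 0.0

    def exec(self, command: str) -> str:
        if self.state == "suspended":
            self.state = "running"      # resume only when a tool call arrives
        self.last_used = time.monotonic()
        return f"ok: {command}"

    def reap_if_idle(self):
        elapsed = time.monotonic() - self.last_used
        if self.state == "running" and elapsed > self.IDLE_TIMEOUT:
            self.state = "suspended"    # stop paying for idle compute

sb = ManagedSandbox()
sb.exec("ls")
sb.reap_if_idle()            # too soon: still running
state_after_use = sb.state
time.sleep(0.06)
sb.reap_if_idle()            # idle long enough: suspended
state_after_idle = sb.state
```

With the harness outside the sandbox, this reaping is safe: suspending the container loses no session state, only warm compute.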
For engineers, this means rethinking how agent sessions are structured. The traditional reliance on a sandbox-local filesystem gives way to database-backed storage, so shared organizational knowledge and skills remain accessible across sessions and sandboxes.
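As a final sketch, here is what database-backed skill storage might look like, using SQLite for brevity. The schema and function names are hypothetical; the point is only that a skill written by one session is readable by any later session in the organization, regardless of which sandbox it runs in.

```python
# Hypothetical sketch: shared skills live in a database rather than a
# sandbox-local filesystem, so every session sees the same state.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE skills (
    org_id TEXT, name TEXT, body TEXT,
    PRIMARY KEY (org_id, name))""")

def save_skill(org_id: str, name: str, body: str) -> None:
    # Any session in the org can record a skill...
    db.execute("INSERT OR REPLACE INTO skills VALUES (?, ?, ?)",
               (org_id, name, body))

def load_skills(org_id: str) -> dict:
    # ...and any later session, in any sandbox, can read it back.
    rows = db.execute("SELECT name, body FROM skills WHERE org_id = ?",
                      (org_id,)).fetchall()
    return dict(rows)

save_skill("acme", "deploy", "run make deploy, then verify health checks")
skills = load_skills("acme")
```

Nothing here depends on a particular sandbox existing, which is exactly the property the out-of-sandbox architecture relies on.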
What to Watch Next
For founders and investors, the shift towards running agent harnesses outside the sandbox signals a broader trend towards scalable, secure AI deployments. As AI continues to integrate into organizational workflows, the ability to manage and share resources efficiently will be key.
Engineers should watch the patterns emerging around this split: durable execution frameworks, fast-resume sandbox providers, and database-backed agent state are all developing quickly. Understanding these dynamics strengthens current projects and positions teams to adopt agents effectively as the tooling matures.
Ultimately, where the agent harness runs is more than a technical decision; it is a strategic one that shapes security, efficiency, and collaboration. Teams that get this boundary right will be best placed to deploy agents that genuinely add value.