OpenAI harness engineering
A super interesting blog post about what the future of fully agent-driven software might look like. It’s essentially a breakdown of an AI-pilled, high-throughput coding system, where the engineer’s role is less about writing the code in the system and more about scaffolding the system so that agents can easily use it to build the software you want. Some of the things I found extremely interesting and had not seen before:
- They use the file system and Markdown files heavily to instruct the agent on how to do things.
- They have an AGENTS.md that acts as a map, plus things like a large architecture diagram, design docs covering the design principles behind the software, execution plans for very large changes in progress, and references for the model to use.
- The specific goal of this layout is progressive disclosure: agents get a small, stable entry point from which they can learn where to look next.
- Another interesting thing was the use of linters to enforce a lot of rules pretty aggressively. They introduced many invariants, specifically around establishing strict boundaries and keeping things structurally sound, and they use custom linters and structural tests to enforce them. This matters because agents pick up patterns quickly: if you do something in one place, it can very quickly spread to a lot of places, so these constraints prevent drift over time.
- The merge philosophy is unique as well. They have very minimal gates that actually block a merge; tests are run, but not necessarily enforced. Here, high throughput matters more than high correctness on every change: in a high-throughput environment, delivering value quickly counts for more than getting each merge exactly right.
- The final thing I found interesting is the use of garbage collection and the need for it. Essentially, AI slop is still very common even under these constraints, so they need some way to clean it up mechanically. One way they do this is with routine AI tech days, where they send agents to clean up some of the tech debt, with a specific focus on reviewing PRs.
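
To make the custom-linter point above concrete for myself, here is a minimal sketch of what a structural test for strict boundaries might look like. Everything in it is my own assumption and not from the post: the three layer names (`ui`, `service`, `domain`), the `ALLOWED_IMPORTS` table, and both function names are hypothetical. It only uses Python's standard `ast` module to find imports and flag ones that cross a boundary.

```python
import ast

# Hypothetical layering (an assumption, not from the post):
# each layer may import only from the layers listed for it.
ALLOWED_IMPORTS = {
    "ui": {"ui", "service"},
    "service": {"service", "domain"},
    "domain": {"domain"},
}


def imported_layers(source: str) -> set[str]:
    """Return the top-level package names imported by a piece of Python source."""
    layers = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                layers.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            layers.add(node.module.split(".")[0])
    return layers


def boundary_violations(layer: str, source: str) -> list[str]:
    """List known layers that `source` imports but `layer` may not depend on."""
    allowed = ALLOWED_IMPORTS[layer]
    return sorted(
        name
        for name in imported_layers(source)
        if name in ALLOWED_IMPORTS and name not in allowed
    )


# A domain module reaching up into the UI layer is flagged;
# third-party and stdlib imports such as json are ignored.
print(boundary_violations("domain", "from ui.widgets import Button\nimport json\n"))
# prints ['ui']
```

A check like this could run over every file in the repo as an ordinary test, so a bad pattern gets caught before agents copy it elsewhere.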
Super duper fascinating!
"harness engineer"
the future of software
until codex six
2026-02-13