Oversight you can't measure is decoration — The Log

This month we published The Human Layer Audit (DOI: 10.5281/zenodo.19453026), and it is built around an uncomfortable finding.

Human oversight, as practiced in most AI deployments today, is theater. A human "reviews" the AI's output, the review takes three seconds, the approval rate is effectively 100%, and an accountability box gets ticked. This is worse than no oversight at all, because it manufactures the appearance of control while providing none of its function. Regulators are starting to notice the difference. Most organizations are not.

The Audit's answer is a maturity model. It defines five components of human oversight and five levels of maturity for each, and it scores them against risk tiers. Each tier sets the minimum acceptable standard according to what the AI system actually does: advise, collaborate, or act with consequences. The point of the model is not the score. The point is that "we have a human in the loop" stops being a sentence anyone can say without evidence.
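To make the shape of that model concrete, here is a minimal sketch of how a score could be checked against a risk tier. It is an illustration, not the Audit's rubric: the component names are placeholders (this post does not list them), the minimum levels per tier are invented for the example, and applying one minimum uniformly to all five components is an assumption.

```python
# Hypothetical placeholders: the Audit defines five components,
# but their names are not given in this post.
COMPONENTS = (
    "component_1",
    "component_2",
    "component_3",
    "component_4",
    "component_5",
)

# Hypothetical minimum maturity level (1 to 5) for each risk tier.
# The real thresholds live in the Audit, not here.
MIN_LEVEL = {"advise": 2, "collaborate": 3, "act": 4}


def gaps(scores: dict[str, int], tier: str) -> dict[str, int]:
    """Return each component that falls below the tier minimum,
    mapped to how many levels short it is."""
    if tier not in MIN_LEVEL:
        raise ValueError(f"unknown risk tier: {tier!r}")
    missing = [c for c in COMPONENTS if c not in scores]
    if missing:
        raise ValueError(f"unscored components: {missing}")
    for name in COMPONENTS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be scored 1 to 5")
    minimum = MIN_LEVEL[tier]
    return {c: minimum - scores[c] for c in COMPONENTS if scores[c] < minimum}


def meets_minimum(scores: dict[str, int], tier: str) -> bool:
    """True only if every component reaches the tier minimum."""
    return not gaps(scores, tier)


# Example: one weak component is enough to fail the highest tier.
example = {
    "component_1": 4,
    "component_2": 4,
    "component_3": 3,
    "component_4": 5,
    "component_5": 4,
}
shortfalls = gaps(example, "act")
```

The design choice worth noticing is that nothing is averaged. Under this sketch a system passes a tier only when every component clears the bar, so a strong score in one place cannot paper over a rubber stamp in another.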

The paper exists because every serious conversation we have had this year ended at the same wall. Executives agreed the human layer mattered, then asked: how do we know if ours is any good? Without an instrument, the honest answer was "you don't." Now there is one.

One more thing the Audit forced on us: it applies to Timer too. We are building AI systems that act inside organizations, so our own oversight architecture gets scored by our own rubric. Publishing those scores is the plan, not a hypothetical. A measuring stick you exempt yourself from is decoration of a different kind.

Oversight you can't measure is decoration.

The future belongs to organizations that remember.