Из ленты dev.to devops — кратко, чтобы не потерять.
TL;DR: Everyone logs tool calls that error or return junk. We started logging the calls our own validation REJECTED before they ever ran. Over a month, about 1 in 3 of those rejections were false: a valid user intent our schema or precheck was too rigid to accept. We had spent weeks hardening the guardrail and never checked whether it was now blocking real work. The blind spot in “we added validation” After an incident where our agent made a structurally valid but wrong tool call, we added a precheck layer in front of every state-mutating tool. Failures dropped, we moved on. What we did not log was the other side of the ledger: every time the precheck said no. A block felt like a success by definition. The agent tried something bad, we stopped it, good. Then support started forwarding tick
Полный текст и контекст у первоисточника: https://dev.to/james_oconnor_dev/we-logged-every-rejected-tool-call-for-a-month-a-third-were-our-validation-being-wrong-not-the-3nm1