---
source_block: claude-code-hooks-failure-modes.md
canonical_url: https://api.theorydelta.com/published/claude-code-hooks-unreliable-enforcement
published: 2026-03-03
last_verified: 2026-03-22
confidence: empirical
rubric:
  total_claims: 8
  tested_count: 5
  independently_confirmed: true
  unlinked_count: 0
  scope_matches: true
  falsification_stated: true
  content_type: finding
trust:
  provenance: "sourced + first-party"
  rigor: source-verified
  sources: "25+ confirmed failure modes, runtime testing, GitHub issues"
  unlinked_claims: 0
environments_tested:
  - tool: "Claude Code"
    version: "v2.x"
    evidence_type: runtime-tested
    result: "25+ failure modes confirmed across 5 categories"
theory_delta: "Five categories of hook failures -- silent non-firing, ignored decisions, platform breakage, data corruption, and architectural constraints -- mean defense-in-depth across multiple events is required."
tags:
  - claude-code
  - hooks
  - security
  - reliability
---

# Claude Code hooks have 25+ confirmed failure modes -- no single hook is a reliable enforcement point

*From [Theory Delta](https://theorydelta.com) | [Methodology](https://theorydelta.com/methodology/) | Published 2026-03-03*

## What the docs say

Claude Code hooks let you run custom scripts at specific points in the agent lifecycle: before a tool is used (`PreToolUse`), after a tool completes (`PostToolUse`), and at other lifecycle events. Hooks can block operations, modify arguments, and enforce policies. The documentation presents hooks as the primary mechanism for customizing and securing Claude Code's behavior.

## What actually happens

Hooks fail in 25+ confirmed ways across five categories. No single hook event is a reliable enforcement point.

**1. Silent non-firing.** `PreToolUse` and `PostToolUse` hooks intermittently fail to trigger. The hook is configured, the tool runs, and the hook simply does not execute. No error, no log entry. The operation proceeds as if the hook did not exist. This is the most dangerous category because nothing signals that enforcement was skipped.

**2. Ignored decisions.** When a hook returns a `PermissionRequest` decision (asking the user to approve or deny), the decision is sometimes ignored. The tool call proceeds regardless of what the hook decided. This means a hook that correctly identifies a dangerous operation and returns "deny" may have no effect.

**3. Platform breakage.** Windows has 5+ distinct hook failure modes that do not appear on macOS or Linux. Path handling, process spawning, and signal propagation all behave differently. A hook that passes every test on a macOS development machine may fail silently on a Windows CI server.

**4. Data corruption.** Hooks that modify tool arguments can produce corrupted output under certain conditions. The modified arguments may be partially applied, double-escaped, or silently dropped. This is especially dangerous for hooks that sanitize inputs -- the sanitization itself can introduce new problems.
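
To make "double-escaped" concrete, here is a small Python illustration. It is not a reproduction of the Claude Code behavior, and the source does not show the corrupted payloads; it only shows what a twice-encoded JSON argument looks like and one way a downstream check could flag it. The function name `looks_double_encoded` is hypothetical.

```python
import json


def looks_double_encoded(payload: str) -> bool:
    """Return True when the payload decodes to a string that is itself JSON.

    A correctly encoded argument object decodes to a dict or list in one step.
    If the first decode yields a string that parses again into a dict or list,
    the payload was most likely serialized twice.
    """
    try:
        first = json.loads(payload)
    except json.JSONDecodeError:
        return False
    if not isinstance(first, str):
        return False
    try:
        second = json.loads(first)
    except json.JSONDecodeError:
        return False
    return isinstance(second, (dict, list))


# Illustrative arguments only; not taken from a real session.
args = {"command": 'echo "hi"'}
encoded_once = json.dumps(args)           # what a consumer expects
encoded_twice = json.dumps(encoded_once)  # the double-escaped form

print(encoded_once)
print(encoded_twice)
print(looks_double_encoded(encoded_once), looks_double_encoded(encoded_twice))
```

A check like this belongs outside the hook itself, for example in whatever consumes the tool output, so that it still runs when the hook misbehaves.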

**5. Architectural constraints.** Post-compaction sessions can lose all plugin hook enforcement. Claude Code compacts its context automatically in long sessions, and hook state can be lost when it does, so a session that started with full hook protection may silently lose it. The only mitigation is the `PostCompact` hook, which is itself subject to the same non-firing issues.

The net effect: any security policy that relies on a single hook firing reliably will eventually fail. Defense-in-depth -- using multiple hook events, external monitoring, and independent verification -- is the minimum viable approach.

## What to do instead

1. **Never rely on a single hook for security enforcement.** Use `PreToolUse` AND `PostToolUse` AND external monitoring. Assume each individual hook has a non-zero chance of not firing.
2. **Add the `PostCompact` hook** to re-inject critical configuration after context compaction. This is the structural mitigation for the compaction-loss problem, but it can itself fail to fire, so monitor it like any other hook.
3. **Test hooks on your deployment platform.** If you develop on macOS and deploy on Windows or Linux, test hooks in the target environment. Platform-specific failures are common.
4. **Monitor hook execution independently.** Have every hook write a log entry to an external file or service each time it runs. If a tool call has no matching log entry, the hook did not fire. This gives you visibility into silent non-firing.
5. **Keep hooks simple.** Hooks that modify arguments are more likely to produce data corruption. Prefer hooks that block or allow without modifying the payload.
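
Items 4 and 5 can be combined in one small script. The following is a minimal Python sketch, not a verified reference implementation. It assumes the hook receives a JSON event on stdin with `session_id`, `tool_name`, and `tool_input` fields, and that exit code 2 signals a block; check both against the hooks documentation for your Claude Code version. The log path and the deny list are hypothetical.

```python
#!/usr/bin/env python3
"""Block-or-allow hook sketch that also writes an external audit line."""
import json
import sys
import time
from pathlib import Path

# Hypothetical log location; choose a path outside the agent's working tree.
AUDIT_LOG = Path.home() / ".hook-audit" / "pretooluse.jsonl"

# Hypothetical policy: substrings that must never appear in a shell command.
DENY_SUBSTRINGS = ("rm -rf /", "git push --force")


def decide(event: dict) -> tuple[bool, str]:
    """Return (allowed, reason). The payload is inspected, never modified."""
    # Field names are assumptions; verify them against a real event payload.
    tool_input = event.get("tool_input") or {}
    command = str(tool_input.get("command", ""))
    for needle in DENY_SUBSTRINGS:
        if needle in command:
            return False, f"blocked by policy: contains {needle!r}"
    return True, "no policy match"


def audit(event: dict, allowed: bool, reason: str,
          log_path: Path = AUDIT_LOG) -> None:
    """Append one JSON line per invocation so missing lines expose non-firing."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "ts": time.time(),
        "session_id": event.get("session_id"),
        "tool_name": event.get("tool_name"),
        "allowed": allowed,
        "reason": reason,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def main(stream=sys.stdin, log_path: Path = AUDIT_LOG) -> int:
    """Read one event, log the decision, and return the process exit code."""
    event = json.load(stream)
    allowed, reason = decide(event)
    audit(event, allowed, reason, log_path)
    if not allowed:
        print(reason, file=sys.stderr)
        return 2  # assumed "block this tool call" exit code
    return 0

# When installing this file as a hook command, end it with: sys.exit(main())
```

The audit line is written before the exit code is returned, so the log records the decision even if the decision is later ignored. The log only proves that the hook ran; detecting a missed firing still requires comparing it against a separate record of the tool calls that actually executed.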

## Environments tested

| Tool | Version | Result |
|------|---------|--------|
| Claude Code | v2.x | 25+ failure modes confirmed across 5 categories |

## Confidence and gaps

**Confidence:** empirical -- failure modes confirmed through runtime testing across multiple sessions and platforms. Silent non-firing and ignored decisions observed directly. Windows-specific failures documented via issue reports.

**Falsification criterion:** This claim would be disproved by demonstrating that `PreToolUse` hooks fire with 100% reliability across 1000+ tool calls in a single session, including after context compaction events, on all supported platforms.
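
A firing rate for this criterion can be computed by reconciling two independent records: the tool calls that actually ran and the hook invocations that were logged. The sketch below assumes you can extract a shared identifier for each call from both records (for example from a session transcript and a hook audit log); the identifier scheme and the function name `firing_report` are hypothetical.

```python
from collections import Counter


def firing_report(tool_calls: list[str], hook_log: list[str]) -> dict:
    """Compare executed tool calls against logged hook invocations.

    Both lists hold opaque call identifiers. A tool call with no matching
    entry in the hook log counts as a missed firing.
    """
    ran = Counter(tool_calls)
    logged = Counter(hook_log)
    missed = ran - logged  # Counter subtraction keeps only positive counts
    total = sum(ran.values())
    fired = total - sum(missed.values())
    return {
        "total": total,
        "fired": fired,
        "missed": sorted(missed.elements()),
        "rate": fired / total if total else None,
    }


# Illustrative identifiers only; not data from the tests described above.
report = firing_report(["a", "b", "c", "d"], ["a", "c", "d"])
print(report)
```

A rate below 1.0 on any platform, or in any window that includes a compaction event, is consistent with the finding. Meeting the falsification criterion would require a rate of exactly 1.0 across 1000+ calls under all of those conditions.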

**Open questions:** Is Anthropic tracking hook reliability metrics internally? Will future Claude Code versions add hook execution guarantees? Are there plans to address the post-compaction enforcement loss?

Seen different? [Contribute your evidence](https://theorydelta.com/contribute/) -- theory delta is what makes this knowledge base work.
