The Webhook That Blocked Itself
Here’s a failure mode that happens predictably, in every sufficiently complex distributed system, once the security layer gets sophisticated enough.
You write an admission webhook: a policy enforcement point that intercepts create, update, and delete requests to your Kubernetes API server and decides whether to allow them. It validates that pods have resource limits. It rejects images from untrusted registries. It enforces namespace labels. You’re proud of it. It works.
Then the pod running your webhook needs to restart. The cluster tries to schedule a new pod for it. The webhook intercepts the create request. The webhook policy checks whether the pod is allowed. The webhook pod doesn’t exist yet to answer. With failurePolicy: Fail, the API server can’t get a verdict, so it rejects the request, and the controller retries into the same wall. Nothing moves.
You’ve built the lock and left the key inside the room.[1]
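To make the loop concrete, here is a minimal sketch in Python. It is a toy model, not Kubernetes code: `ToyCluster`, `admit`, and `create_pod` are names invented for illustration, and the only thing modeled is whether the webhook is reachable, not what its policy says.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy model: one validating webhook served by a single pod."""
    webhook_pod_running: bool = False
    failure_policy: str = "Fail"  # "Fail" or "Ignore", mirroring failurePolicy
    exempt_namespaces: set = field(default_factory=set)

    def admit(self, namespace: str) -> bool:
        # Requests in exempt namespaces never reach the webhook.
        if namespace in self.exempt_namespaces:
            return True
        # The webhook can only answer if its pod is running.
        if self.webhook_pod_running:
            return True  # assume the policy passes; only availability is modeled
        # Webhook unreachable: the failure policy decides.
        return self.failure_policy == "Ignore"

    def create_pod(self, namespace: str, is_webhook: bool = False) -> bool:
        if not self.admit(namespace):
            return False  # rejected; a real controller would keep retrying
        if is_webhook:
            self.webhook_pod_running = True
        return True


# The webhook's own pod cannot be created: admission needs the pod,
# and the pod needs admission.
stuck = ToyCluster()
stuck_result = stuck.create_pod("policy-system", is_webhook=True)

# Exempting the webhook's namespace (hypothetical name) breaks the loop.
fixed = ToyCluster(exempt_namespaces={"policy-system"})
fixed_result = fixed.create_pod("policy-system", is_webhook=True)
```

In this model `stuck_result` is `False` no matter how many times you retry, while `fixed_result` is `True` and every later request gets a real answer.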
Why This Happens
The problem is a category error in the data model.
Your webhook is a set of Kubernetes resources: a Pod, a Deployment, a Service. The things your webhook enforces rules on are also Kubernetes resources. They live in the same cluster, go through the same API server, are subject to the same scheduling system. At the data model level, your enforcement mechanism is indistinguishable from the things it’s enforcing.
So when your webhook intercepts a Pod creation request, it has no structural way to distinguish “this is the pod that is the enforcement mechanism” from “this is the pod that the enforcement mechanism should check.” The enforcement mechanism can see itself in the registry. And when it tries to apply its own rules to itself, the recursion closes.
The official Kubernetes documentation calls this a “dependency loop” and the recommended fix is a namespaceSelector in your webhook configuration that excludes the namespace your webhook lives in.[2] Simple. Pragmatic. But once you understand the deeper shape of the problem, you realize the exemption list is more interesting than the webhook itself.
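As a rough illustration of what that selector looks like and how it behaves, the sketch below builds the `namespaceSelector` fragment as a Python dict and evaluates it against a namespace’s labels. The webhook namespace `policy-system` is a made-up name, and `selector_matches` is a simplified reimplementation of label-selector semantics written for this example, not a Kubernetes client call. It assumes namespaces carry the `kubernetes.io/metadata.name` label, which recent Kubernetes versions set automatically.

```python
NAME_LABEL = "kubernetes.io/metadata.name"

# Hypothetical exemption list: system namespaces plus the webhook's own.
EXEMPT = ["kube-system", "kube-public", "kube-node-lease", "policy-system"]

# The fragment that would sit under webhooks[].namespaceSelector.
namespace_selector = {
    "matchExpressions": [
        {"key": NAME_LABEL, "operator": "NotIn", "values": EXEMPT},
    ]
}


def selector_matches(selector: dict, labels: dict) -> bool:
    """Return True if a namespace with these labels is in the webhook's scope."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            if key not in labels or labels[key] not in values:
                return False
        elif op == "NotIn":
            # A namespace without the key still matches NotIn.
            if key in labels and labels[key] in values:
                return False
        elif op == "Exists":
            if key not in labels:
                return False
        elif op == "DoesNotExist":
            if key in labels:
                return False
        else:
            raise ValueError(f"unknown operator: {op}")
    return True


in_scope = selector_matches(namespace_selector, {NAME_LABEL: "default"})
skipped = selector_matches(namespace_selector, {NAME_LABEL: "kube-system"})
```

Note the `NotIn` branch: a namespace that lacks the label entirely is still in scope, which is the safe default for an enforcement point.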
The Exemption List Tells You What You’re Trusting
The Kubernetes documentation doesn’t just tell you to exclude your own namespace. It tells you to exclude kube-system, kube-public, and kube-node-lease.[3] Always. Without exception.
Why? Because kube-system contains CoreDNS, kube-proxy, the CNI networking plugin, and other components that the rest of the cluster, including your webhook, depends on to function. If your webhook intercepts and rejects a CoreDNS restart, you’ve lost DNS. No DNS means your webhook can’t resolve its external dependencies, so it can’t do the outbound lookup it needs to validate a policy. The webhook has cut off the branch it’s sitting on.
The exemption list isn’t just “things the enforcer needs to skip to avoid blocking itself.” It’s the full set of things the enforcement mechanism depends on to exist. The boundary of the exemption is a map of the trust substrate. If you exclude kube-system, you’re saying: everything in kube-system is beneath the enforcement layer. It has to be, or the enforcement layer can’t run.
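One way to read “the exemption is a map of the trust substrate” is as a graph problem: start at the webhook, follow every dependency edge, and collect the namespaces you land in. The sketch below does that with a breadth-first walk. The dependency graph and the namespace assignments are invented for illustration; in a real cluster you would have to work them out yourself.

```python
from collections import deque

# Hypothetical graph: component -> components it needs in order to run.
DEPENDS_ON = {
    "policy-webhook": ["coredns", "kube-proxy", "cni-plugin"],
    "coredns": ["kube-proxy", "cni-plugin"],
    "kube-proxy": [],
    "cni-plugin": [],
    "billing-api": ["coredns"],
}

# Hypothetical placement of each component.
NAMESPACE_OF = {
    "policy-webhook": "policy-system",
    "coredns": "kube-system",
    "kube-proxy": "kube-system",
    "cni-plugin": "kube-system",
    "billing-api": "billing",
}


def trust_substrate(root: str) -> set:
    """Everything `root` transitively depends on, including itself."""
    seen = {root}
    queue = deque([root])
    while queue:
        component = queue.popleft()
        for dep in DEPENDS_ON.get(component, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def namespaces_to_exempt(root: str) -> list:
    """Namespaces the enforcer stands on and therefore cannot police."""
    return sorted({NAMESPACE_OF[c] for c in trust_substrate(root)})


exemptions = namespaces_to_exempt("policy-webhook")
```

With this made-up graph, `exemptions` contains `kube-system` and `policy-system` but not `billing`: the billing service depends on the same substrate, yet the webhook does not depend on it, so it stays inside the enforcement boundary.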
Microsoft Azure’s Kubernetes Service took this to its logical conclusion by building an Admissions Enforcer, a system that automatically applies the correct namespace exemptions to every custom admission webhook deployed in the cluster.[4] They had to. When individual webhook authors are left to manage their own exemption selectors, the pattern breaks constantly and predictably. So AKS built a central policy that enforces the exemption of all other policies.
The Admissions Enforcer is, of course, exempt from itself.
The Clean Solution: Keep the Enforcer Outside the Model
When a system gets this right, the enforcement mechanism doesn’t live in the data model at all. The bypass isn’t an exemption entry — it’s a different layer of the stack.
Linux root access is the canonical example. When a process running as uid=0 tries to read a file it doesn’t own, does Linux look in the file’s permission bits, find a special “root can bypass this” entry, and proceed? No. There is no such entry. The file’s metadata has no notion of root.
The bypass happens in generic_permission() in the VFS layer of the kernel.[5] The ordinary owner/group/other check runs first. If it denies access, the kernel falls back to a capability check: if the process has CAP_DAC_OVERRIDE, the permission check returns success for reads and writes regardless of what the mode bits say (executing a regular file still requires at least one execute bit to be set). There’s no “root” row in the file’s access control metadata. The capability check is kernel code in a different layer, not an entry in the thing being protected.
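The shape of that logic can be sketched in a few lines of Python. This is a deliberately simplified model, not a transcription of `fs/namei.c`: it ignores directories, ACLs, supplementary groups, user namespaces, and `CAP_DAC_READ_SEARCH`, and the `Inode` and `Process` types are invented for the example. In the kernel sources I am aware of, the mode-bit check runs first and the capability check is the fallback, so the sketch follows that order. The point to notice is where the bypass lives: in a branch of the checking function, not in any field of the inode.

```python
from dataclasses import dataclass

# Same bit values as the rwx permission triplet.
MAY_EXEC, MAY_WRITE, MAY_READ = 1, 2, 4
CAP_DAC_OVERRIDE = "CAP_DAC_OVERRIDE"


@dataclass
class Inode:
    mode: int  # e.g. 0o600; note there is no "root" field here
    uid: int
    gid: int


@dataclass
class Process:
    uid: int
    gid: int
    capabilities: frozenset = frozenset()


def mode_bits_allow(proc: Process, inode: Inode, mask: int) -> bool:
    """Classic owner/group/other check against the inode's mode bits."""
    if proc.uid == inode.uid:
        bits = (inode.mode >> 6) & 7
    elif proc.gid == inode.gid:
        bits = (inode.mode >> 3) & 7
    else:
        bits = inode.mode & 7
    return (bits & mask) == mask


def permission(proc: Process, inode: Inode, mask: int) -> bool:
    """Simplified model of the check for a regular file."""
    if mode_bits_allow(proc, inode, mask):
        return True
    # Fallback: the capability override is code, not inode data.
    # Execute is only overridable if some execute bit is set.
    exec_overridable = not (mask & MAY_EXEC) or bool(inode.mode & 0o111)
    return exec_overridable and CAP_DAC_OVERRIDE in proc.capabilities


secret = Inode(mode=0o600, uid=1000, gid=1000)
root = Process(uid=0, gid=0, capabilities=frozenset({CAP_DAC_OVERRIDE}))
root_dropped = Process(uid=0, gid=0)  # uid 0 with the capability dropped

root_can_read = permission(root, secret, MAY_READ)
dropped_can_read = permission(root_dropped, secret, MAY_READ)
```

Here `root_can_read` is `True` and `dropped_can_read` is `False`: what grants the bypass is the capability, not the uid, and nothing in `secret` had to change to make either outcome happen.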
This is what Saltzer and Schroeder called complete mediation in their 1975 paper on secure systems design: every access to every object must be checked through the authorization mechanism.[6] The corollary is that the authorization mechanism itself cannot be subject to the checks it performs. Otherwise you need a meta-mechanism to authorize the authorizer, and a meta-meta-mechanism to authorize that, and so on. The recursion has to terminate somewhere, and where it terminates is the boundary between your enforcement layer and whatever you’re trusting without further verification.
For Linux, that boundary is the kernel itself. Kernel code is trusted by definition: it runs in the processor’s most privileged mode (ring 0 on x86), the hardware trust root. The capability check is part of the kernel; filesystem permission bits are data the kernel reads. There’s no confusion between the two levels because one is the checker and the other is what it checks, and the privilege boundary keeps user-space code from rewriting the checker.
In distributed systems, you rarely have that luxury. Everything is the same ring. Everything is software. Everything goes through the same API.
When You Can’t Avoid It, Know What You’re Accepting
The exemption-based approach isn’t wrong. It’s often the only option available. But the exemption is not a solved problem — it’s a managed one.
Kubernetes has system:masters, a group that bypasses all RBAC evaluation entirely. The official security documentation is explicit: if a user is in system:masters, their permissions cannot be revoked by removing role bindings.[7] This is necessary because during bootstrapping, someone has to be able to administer the cluster before the RBAC system is configured. But it means a cluster’s RBAC model has a named entity, system:masters, that is in the authorization system but does not go through the authorization system.
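The structural consequence is easy to see in a toy model. The sketch below is not the API server’s code; `User`, `role_bindings`, `rbac_allows`, and `authorize` are invented names, and the RBAC data is reduced to a set of tuples. It only illustrates why deleting bindings cannot revoke access that never consulted them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    groups: tuple = ()


# Toy RBAC state: (group, verb, resource) grants created by role bindings.
role_bindings = {("developers", "get", "pods")}


def rbac_allows(user: User, verb: str, resource: str) -> bool:
    return any((group, verb, resource) in role_bindings for group in user.groups)


def authorize(user: User, verb: str, resource: str) -> bool:
    # Short-circuit ahead of RBAC: no binding is consulted for this group.
    if "system:masters" in user.groups:
        return True
    return rbac_allows(user, verb, resource)


admin = User("alice", ("system:masters",))
dev = User("bob", ("developers",))

# Remove every binding in the cluster.
role_bindings.clear()

admin_after = authorize(admin, "delete", "nodes")
dev_after = authorize(dev, "get", "pods")
```

After the bindings are cleared, `dev_after` is `False` and `admin_after` is still `True`. The group name appears in the same identity data as every other group, but the decision about it is made somewhere the bindings cannot reach.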
AWS has the same shape at the account level. The root user of an AWS account is not subject to identity-based IAM policy evaluation: you cannot attach an IAM policy to the root user to restrict what it can do.[8] IAM doesn’t govern the root user because IAM is a service inside the account the root user owns. IAM can’t authorize the entity that authorizes IAM.
In each of these cases, the “exemption” isn’t an oversight. It’s the enforcement mechanism admitting that it has a foundation it didn’t build and can’t verify. The RBAC system rests on system:masters. IAM rests on the root account. Your admission webhook rests on kube-system. None of those foundations go through the authorization layer above them.
What matters is whether you’ve made that admission consciously. The exemption list is a statement of trust. Leaving kube-system out of your webhook’s scope isn’t sloppy configuration — it’s acknowledging that your enforcement layer has a substrate, and the substrate is outside your enforcement layer’s reach.
The dangerous version isn’t the deliberate exemption. It’s the accidental one — the namespace that slipped through a matchLabels selector, the IAM policy that was attached to a role instead of the user, the webhook that only runs on CREATE but not UPDATE. Those are enforcer bypasses that don’t know they’re exemptions. They don’t say “this is trusted without verification.” They just fail silently.
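Accidental exemptions are at least partly detectable. As a sketch, the function below inspects one webhook entry from a ValidatingWebhookConfiguration (represented as a plain dict) and flags two of the patterns above: rules that cover CREATE but not UPDATE, and namespaces excluded by name that are not on a documented list. It also flags `failurePolicy: Ignore`, which turns every webhook outage into a bypass. `DOCUMENTED_EXEMPTIONS` and the example webhook are hypothetical, and the check only understands `NotIn` expressions on the `kubernetes.io/metadata.name` label, so treat it as a starting point rather than a complete audit.

```python
NAME_LABEL = "kubernetes.io/metadata.name"

# Hypothetical: the exemptions your team has written down and agreed to.
DOCUMENTED_EXEMPTIONS = {
    "kube-system", "kube-public", "kube-node-lease", "policy-system",
}


def audit_webhook(webhook: dict) -> list:
    """Return human-readable findings for one webhook entry."""
    findings = []

    operations = set()
    for rule in webhook.get("rules", []):
        operations.update(rule.get("operations", []))
    if "*" not in operations and "CREATE" in operations and "UPDATE" not in operations:
        findings.append("rules cover CREATE but not UPDATE")

    # failurePolicy defaults to Fail in admissionregistration.k8s.io/v1.
    if webhook.get("failurePolicy", "Fail") == "Ignore":
        findings.append("failurePolicy is Ignore")

    excluded = set()
    selector = webhook.get("namespaceSelector", {})
    for expr in selector.get("matchExpressions", []):
        if expr.get("key") == NAME_LABEL and expr.get("operator") == "NotIn":
            excluded.update(expr.get("values", []))
    for namespace in sorted(excluded - DOCUMENTED_EXEMPTIONS):
        findings.append(f"undocumented exemption: {namespace}")

    return findings


example_webhook = {
    "name": "limits.policy.example.com",
    "rules": [{"operations": ["CREATE"], "resources": ["pods"]}],
    "namespaceSelector": {
        "matchExpressions": [{
            "key": NAME_LABEL,
            "operator": "NotIn",
            "values": ["kube-system", "policy-system", "team-sandbox"],
        }]
    },
}

findings = audit_webhook(example_webhook)
```

For the example entry, the audit reports the missing UPDATE coverage and the `team-sandbox` exclusion. Neither is necessarily wrong, but both are now statements someone has to own.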
If you’re going to have exceptions to your enforcement mechanism — and you are, because the enforcement mechanism has to stand on something — make them explicit, make them documented, and make the exemption list small enough that you can read it in one sitting.
That list is your trust model. Treat it like one.
[1] This failure mode is documented in the official Kubernetes documentation. See “Admission Webhooks: Good Practices”, Kubernetes Documentation: “Dependency loops can occur in scenarios like the following: Your webhook intercepts cluster add-on components… that your webhook depends on.”
[2] Kubernetes Documentation, “Admission Webhooks: Good Practices”. The recommended fix: namespaceSelector with matchExpressions excluding kube-system, kube-public, and the webhook’s own namespace.
[3] Same source. “A critical best practice is to exclude system namespaces (kube-system, kube-public, kube-node-lease) from your webhooks.”
[4] Microsoft Azure AKS documentation describes the Admissions Enforcer: “To protect the stability of the system… AKS has an Admissions Enforcer, which automatically excludes kube-system and AKS internal namespaces” from custom admission controllers. See AKS admission controllers documentation.
[5] Linux man-pages (man7.org), capabilities(7): “Privileged processes bypass all kernel permission checks.” The bypass is implemented via CAP_DAC_OVERRIDE in generic_permission() in fs/namei.c, a conditional path in the VFS layer, not an entry in inode permission bits. Since Linux 2.2, root access is capability-mediated, meaning root processes with dropped capabilities lose the bypass, and non-root processes with CAP_DAC_OVERRIDE gain it.
[6] Jerome H. Saltzer and Michael D. Schroeder, “The Protection of Information in Computer Systems”, Proceedings of the IEEE 63(9), 1975. Complete mediation is one of eight design principles: “Every access to every object must be checked for authority.”
[7] Kubernetes Documentation, “RBAC Good Practices”: “Avoid adding users to the system:masters group. Any user who is a member of this group bypasses all RBAC rights checks and will always have unrestricted superuser access, which cannot be revoked by removing RoleBindings or ClusterRoleBindings.”
[8] AWS IAM Documentation, “Policy Evaluation Logic”: “By default, all requests are implicitly denied with the exception of the AWS account root user, which has full access.” Root is not an IAM principal that can be restricted by identity-based IAM policies; it precedes IAM in the account’s authority hierarchy.