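Kubernetes 1.30: Validating Admission Policy Is Generally Available
On behalf of the Kubernetes project, I am excited to announce that ValidatingAdmissionPolicy has reached general availability (GA) as part of the Kubernetes 1.30 release. If you have not yet read about this new declarative alternative to validating admission webhooks, you may want to read our previous blog post about the feature. If you have already heard about ValidatingAdmissionPolicies and are eager to try them out, there is no better time than now.
Let's get a taste of ValidatingAdmissionPolicy by replacing a simple webhook.
Example admission webhook
First, let's look at an example of a simple webhook. Here is an excerpt from a webhook that enforces runAsNonRoot, readOnlyRootFilesystem, allowPrivilegeEscalation, and privileged to be set to their least permissive values.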
// Excerpt. Assumes these imports: "fmt", appsv1 "k8s.io/api/apps/v1",
// and "k8s.io/apimachinery/pkg/util/errors".
func verifyDeployment(deploy *appsv1.Deployment) error {
	var errs []error
	for i, c := range deploy.Spec.Template.Spec.Containers {
		if c.Name == "" {
			return fmt.Errorf("container %d has no name", i)
		}
		if c.SecurityContext == nil {
			errs = append(errs, fmt.Errorf("container %q does not have SecurityContext", c.Name))
			// Skip the field checks below; they would dereference a nil pointer.
			continue
		}
		if c.SecurityContext.RunAsNonRoot == nil || !*c.SecurityContext.RunAsNonRoot {
			errs = append(errs, fmt.Errorf("container %q must set RunAsNonRoot to true in its SecurityContext", c.Name))
		}
		if c.SecurityContext.ReadOnlyRootFilesystem == nil || !*c.SecurityContext.ReadOnlyRootFilesystem {
			errs = append(errs, fmt.Errorf("container %q must set ReadOnlyRootFilesystem to true in its SecurityContext", c.Name))
		}
		if c.SecurityContext.AllowPrivilegeEscalation != nil && *c.SecurityContext.AllowPrivilegeEscalation {
			errs = append(errs, fmt.Errorf("container %q must NOT set AllowPrivilegeEscalation to true in its SecurityContext", c.Name))
		}
		if c.SecurityContext.Privileged != nil && *c.SecurityContext.Privileged {
			errs = append(errs, fmt.Errorf("container %q must NOT set Privileged to true in its SecurityContext", c.Name))
		}
	}
	// NewAggregate returns nil when errs is empty.
	return errors.NewAggregate(errs)
}
For background, see "What are admission webhooks?". You can also view the full code of this webhook to follow along with this walkthrough.
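To make the webhook's behavior concrete, here is a minimal, standard-library-only sketch of the same checks. The SecurityContext and Container types and the verifyContainers function are hypothetical stand-ins for the real Kubernetes API types, reduced to the four fields the excerpt inspects. The sample input is a container that sets only privileged and allowPrivilegeEscalation to true.

```go
package main

import "fmt"

// SecurityContext and Container are simplified stand-ins (assumptions),
// modeling only the fields that the webhook excerpt inspects.
type SecurityContext struct {
	RunAsNonRoot             *bool
	ReadOnlyRootFilesystem   *bool
	AllowPrivilegeEscalation *bool
	Privileged               *bool
}

type Container struct {
	Name            string
	SecurityContext *SecurityContext
}

// verifyContainers applies the same four checks as the webhook excerpt
// and returns one error per violation.
func verifyContainers(containers []Container) []error {
	var errs []error
	for _, c := range containers {
		sc := c.SecurityContext
		if sc == nil {
			errs = append(errs, fmt.Errorf("container %q does not have SecurityContext", c.Name))
			continue // nothing further to inspect
		}
		if sc.RunAsNonRoot == nil || !*sc.RunAsNonRoot {
			errs = append(errs, fmt.Errorf("container %q must set RunAsNonRoot to true", c.Name))
		}
		if sc.ReadOnlyRootFilesystem == nil || !*sc.ReadOnlyRootFilesystem {
			errs = append(errs, fmt.Errorf("container %q must set ReadOnlyRootFilesystem to true", c.Name))
		}
		if sc.AllowPrivilegeEscalation != nil && *sc.AllowPrivilegeEscalation {
			errs = append(errs, fmt.Errorf("container %q must NOT set AllowPrivilegeEscalation to true", c.Name))
		}
		if sc.Privileged != nil && *sc.Privileged {
			errs = append(errs, fmt.Errorf("container %q must NOT set Privileged to true", c.Name))
		}
	}
	return errs
}

func main() {
	enabled := true
	errs := verifyContainers([]Container{{
		Name: "nginx",
		SecurityContext: &SecurityContext{
			Privileged:               &enabled,
			AllowPrivilegeEscalation: &enabled,
		},
	}})
	for _, err := range errs {
		fmt.Println(err)
	}
}
```

Two of the checks fail when a field is unset (the "must set" rules), while the other two fail only when a field is explicitly true (the "must NOT set" rules). This asymmetry is what the policy expressions need to reproduce.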
The policy
Now let's try to faithfully recreate this validation with a ValidatingAdmissionPolicy.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "pod-security.policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  validations:
  - expression: object.spec.template.spec.containers.all(c, has(c.securityContext) && has(c.securityContext.runAsNonRoot) && c.securityContext.runAsNonRoot)
    message: 'all containers must set runAsNonRoot to true'
  - expression: object.spec.template.spec.containers.all(c, has(c.securityContext) && has(c.securityContext.readOnlyRootFilesystem) && c.securityContext.readOnlyRootFilesystem)
    message: 'all containers must set readOnlyRootFilesystem to true'
  - expression: object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.allowPrivilegeEscalation) || !c.securityContext.allowPrivilegeEscalation)
    message: 'all containers must NOT set allowPrivilegeEscalation to true'
  - expression: object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.Privileged) || !c.securityContext.Privileged)
    message: 'all containers must NOT set privileged to true'
Create the policy with kubectl. Great, no errors so far. But let's fetch the policy object and take a look at its status.
kubectl get -oyaml validatingadmissionpolicies/pod-security.policy.example.com
  status:
    typeChecking:
      expressionWarnings:
      - fieldRef: spec.validations[3].expression
        warning: |
          apps/v1, Kind=Deployment: ERROR: <input>:1:76: undefined field 'Privileged'
           | object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.Privileged) || !c.securityContext.Privileged)
           | ...........................................................................^
          ERROR: <input>:1:128: undefined field 'Privileged'
           | object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.Privileged) || !c.securityContext.Privileged)
           | ...............................................................................................................................^
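The policy was type-checked against its matched type, apps/v1.Deployment. According to the fieldRef, the problem is in the expression at index 3 (indexes start at 0, so this is the fourth validation). The offending expression accesses an undefined Privileged field. Ah, it looks like a copy-and-paste error: the field name should start with a lowercase letter.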
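Because type-checking problems surface as warnings in the status rather than as a failed create, a small check in a test or CI step can help catch them. The following is a minimal sketch, assuming the policy has been fetched as JSON (for example with kubectl get -o json). The policyStatus type and the flaggedValidations function are hypothetical names; they decode only the status.typeChecking.expressionWarnings fields shown in the output above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strconv"
)

// policyStatus decodes only the part of the policy object that carries
// type-checking warnings.
type policyStatus struct {
	Status struct {
		TypeChecking struct {
			ExpressionWarnings []struct {
				FieldRef string `json:"fieldRef"`
				Warning  string `json:"warning"`
			} `json:"expressionWarnings"`
		} `json:"typeChecking"`
	} `json:"status"`
}

// validationRef matches fieldRef values such as "spec.validations[3].expression".
var validationRef = regexp.MustCompile(`^spec\.validations\[(\d+)\]\.expression$`)

// flaggedValidations returns the zero-based indexes of the validations
// that have at least one type-checking warning.
func flaggedValidations(policyJSON []byte) ([]int, error) {
	var p policyStatus
	if err := json.Unmarshal(policyJSON, &p); err != nil {
		return nil, err
	}
	var indexes []int
	for _, w := range p.Status.TypeChecking.ExpressionWarnings {
		m := validationRef.FindStringSubmatch(w.FieldRef)
		if m == nil {
			continue // the warning refers to something other than a validation
		}
		n, err := strconv.Atoi(m[1])
		if err != nil {
			return nil, err
		}
		indexes = append(indexes, n)
	}
	return indexes, nil
}

func main() {
	// Abbreviated sample modeled on the status shown above.
	sample := []byte(`{"status":{"typeChecking":{"expressionWarnings":[
		{"fieldRef":"spec.validations[3].expression","warning":"undefined field 'Privileged'"}
	]}}}`)
	indexes, err := flaggedValidations(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(indexes) // prints [3]
}
```

An empty result means no validation expression was flagged, which is the state the corrected policy should reach.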
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "pod-security.policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  validations:
  - expression: object.spec.template.spec.containers.all(c, has(c.securityContext) && has(c.securityContext.runAsNonRoot) && c.securityContext.runAsNonRoot)
    message: 'all containers must set runAsNonRoot to true'
  - expression: object.spec.template.spec.containers.all(c, has(c.securityContext) && has(c.securityContext.readOnlyRootFilesystem) && c.securityContext.readOnlyRootFilesystem)
    message: 'all containers must set readOnlyRootFilesystem to true'
  - expression: object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.allowPrivilegeEscalation) || !c.securityContext.allowPrivilegeEscalation)
    message: 'all containers must NOT set allowPrivilegeEscalation to true'
  - expression: object.spec.template.spec.containers.all(c, !has(c.securityContext) || !has(c.securityContext.privileged) || !c.securityContext.privileged)
    message: 'all containers must NOT set privileged to true'
Check its status again, and you should see that all warnings have been cleared.
Next, let's create a namespace for our tests.
kubectl create namespace policy-test
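Then, I bind the policy to the namespace. For now, I set validationActions to Warn so that the policy prints warnings instead of rejecting requests. This is especially useful for collecting the results of all expressions during development and automated testing.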
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "pod-security.policy-binding.example.com"
spec:
  policyName: "pod-security.policy.example.com"
  validationActions: ["Warn"]
  matchResources:
    namespaceSelector:
      matchLabels:
        "kubernetes.io/metadata.name": "policy-test"
Test the policy enforcement.
kubectl create -n policy-test -f- <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        securityContext:
          privileged: true
          allowPrivilegeEscalation: true
EOF
Warning: Validation failed for ValidatingAdmissionPolicy 'pod-security.policy.example.com' with binding 'pod-security.policy-binding.example.com': all containers must set runAsNonRoot to true
Warning: Validation failed for ValidatingAdmissionPolicy 'pod-security.policy.example.com' with binding 'pod-security.policy-binding.example.com': all containers must set readOnlyRootFilesystem to true
Warning: Validation failed for ValidatingAdmissionPolicy 'pod-security.policy.example.com' with binding 'pod-security.policy-binding.example.com': all containers must NOT set allowPrivilegeEscalation to true
Warning: Validation failed for ValidatingAdmissionPolicy 'pod-security.policy.example.com' with binding 'pod-security.policy-binding.example.com': all containers must NOT set privileged to true
Error from server: error when creating "STDIN": admission webhook "webhook.example.com" denied the request: [container "nginx" must set RunAsNonRoot to true in its SecurityContext, container "nginx" must set ReadOnlyRootFilesystem to true in its SecurityContext, container "nginx" must NOT set AllowPrivilegeEscalation to true in its SecurityContext, container "nginx" must NOT set Privileged to true in its SecurityContext]
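Looks great! The policy and the webhook give equivalent results. After testing a few more cases, once we are confident in our policy, it may be time for some cleanup. Two things stand out:
- For every expression, we repeat the access to object.spec.template.spec.containers and to each securityContext;
- There is a pattern of checking that a field is present and then accessing it, which is a bit verbose.
Fortunately, since Kubernetes 1.28, there are new solutions to both issues. Variable Composition allows us to extract repeated sub-expressions into their own variables. Kubernetes also enables the optional library for CEL, which is very useful for working with optional fields.
With these two features in mind, let's refactor the policy a bit.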
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "pod-security.policy.example.com"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  variables:
  - name: containers
    expression: object.spec.template.spec.containers
  - name: securityContexts
    expression: 'variables.containers.map(c, c.?securityContext)'
  validations:
  - expression: variables.securityContexts.all(c, c.?runAsNonRoot == optional.of(true))
    message: 'all containers must set runAsNonRoot to true'
  - expression: variables.securityContexts.all(c, c.?readOnlyRootFilesystem == optional.of(true))
    message: 'all containers must set readOnlyRootFilesystem to true'
  - expression: variables.securityContexts.all(c, c.?allowPrivilegeEscalation != optional.of(true))
    message: 'all containers must NOT set allowPrivilegeEscalation to true'
  - expression: variables.securityContexts.all(c, c.?privileged != optional.of(true))
    message: 'all containers must NOT set privileged to true'
The policy is now much cleaner and more readable. Update the policy, and you should see it behave the same as before.
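One way to convince yourself that the optional-based expressions mean the same thing as the has()-based ones is to enumerate the three states a boolean field can be in: absent, false, and true. The sketch below models this in Go with a *bool, where nil stands for an absent field. The optional type and all function names are hypothetical stand-ins, not the CEL implementation, and the sketch models only the leaf field (an absent securityContext behaves like a securityContext with every field absent).

```go
package main

import "fmt"

// optional is a tiny stand-in for a CEL optional boolean value.
type optional struct {
	set   bool
	value bool
}

func optionalOf(v bool) optional { return optional{set: true, value: v} }

// fromPointer models "sc.?field": an empty optional when the field is absent.
func fromPointer(p *bool) optional {
	if p == nil {
		return optional{}
	}
	return optionalOf(*p)
}

// mustBeTrueVerbose models: has(sc.field) && sc.field
func mustBeTrueVerbose(field *bool) bool { return field != nil && *field }

// mustBeTrueOptional models: sc.?field == optional.of(true)
func mustBeTrueOptional(field *bool) bool { return fromPointer(field) == optionalOf(true) }

// mustNotBeTrueVerbose models: !has(sc.field) || !sc.field
func mustNotBeTrueVerbose(field *bool) bool { return field == nil || !*field }

// mustNotBeTrueOptional models: sc.?field != optional.of(true)
func mustNotBeTrueOptional(field *bool) bool { return fromPointer(field) != optionalOf(true) }

func main() {
	yes, no := true, false
	states := []struct {
		label string
		field *bool
	}{
		{"absent", nil},
		{"false", &no},
		{"true", &yes},
	}
	for _, s := range states {
		fmt.Printf("%-6s mustBeTrue: verbose=%v optional=%v | mustNotBeTrue: verbose=%v optional=%v\n",
			s.label,
			mustBeTrueVerbose(s.field), mustBeTrueOptional(s.field),
			mustNotBeTrueVerbose(s.field), mustNotBeTrueOptional(s.field))
	}
}
```

In each of the three states the verbose and optional forms agree, which is why the refactored policy can drop the explicit presence checks.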
Now let's change the action of the policy binding from warning to actually denying requests that fail validation.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "pod-security.policy-binding.example.com"
spec:
  policyName: "pod-security.policy.example.com"
  validationActions: ["Deny"]
  matchResources:
    namespaceSelector:
      matchLabels:
        "kubernetes.io/metadata.name": "policy-test"
Finally, remove the webhook. The result should now contain only messages from the policy.
kubectl create -n policy-test -f- <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        securityContext:
          privileged: true
          allowPrivilegeEscalation: true
EOF
The deployments "nginx" is invalid: : ValidatingAdmissionPolicy 'pod-security.policy.example.com' with binding 'pod-security.policy-binding.example.com' denied request: all containers must set runAsNonRoot to true
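Please note that, by design, the policy stops evaluating at the first expression that causes the request to be denied. This differs from the case where the expressions only generate warnings, in which every failing expression is reported.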
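The difference between the two modes can be pictured with a small model. This is only an illustrative sketch of the behavior described above, not how the API server is implemented; the validation type and the evaluate function are hypothetical, and the passes field stands in for the result of evaluating a CEL expression.

```go
package main

import "fmt"

// validation pairs a message with a precomputed result that stands in
// for evaluating the corresponding CEL expression.
type validation struct {
	message string
	passes  bool
}

// evaluate returns the messages that would be reported for one request.
// With deny set, evaluation stops at the first failing validation;
// otherwise every failing validation contributes a message.
func evaluate(validations []validation, deny bool) []string {
	var messages []string
	for _, v := range validations {
		if v.passes {
			continue
		}
		messages = append(messages, v.message)
		if deny {
			break
		}
	}
	return messages
}

func main() {
	// All four validations fail for the sample Deployment used above.
	validations := []validation{
		{"all containers must set runAsNonRoot to true", false},
		{"all containers must set readOnlyRootFilesystem to true", false},
		{"all containers must NOT set allowPrivilegeEscalation to true", false},
		{"all containers must NOT set privileged to true", false},
	}
	fmt.Println("Warn:", len(evaluate(validations, false)), "message(s)")
	fmt.Println("Deny:", len(evaluate(validations, true)), "message(s)")
}
```

This matches the outputs in the walkthrough: four warnings under Warn, but a single message in the denial. It is also one reason to keep a Warn binding around during development, since it shows every failing rule at once.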
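Set up monitoring
Unlike a webhook, a policy is not a dedicated process that can expose its own metrics. Instead, you can use metrics from the API server.
Here are some examples of common monitoring tasks, written in the Prometheus Query Language (PromQL).
To find the 95th percentile execution duration of the policy shown above: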
histogram_quantile(0.95, sum(rate(apiserver_validating_admission_policy_check_duration_seconds_bucket{policy="pod-security.policy.example.com"}[5m])) by (le))
To find the rate of the policy evaluations:
rate(apiserver_validating_admission_policy_check_total{policy="pod-security.policy.example.com"}[5m])
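You can read the metrics reference to learn more about the metrics above. The metrics for ValidatingAdmissionPolicy are currently in alpha; more and better metrics are expected as they move toward stability in future releases.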
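If you manage more than one policy, it can be handy to generate these queries rather than copy them by hand. The sketch below builds the two queries above for a given policy name and wraps one in a URL for the Prometheus HTTP API instant-query endpoint (/api/v1/query). The function names and the http://localhost:9090 address are assumptions for illustration; substitute the address of your own Prometheus server.

```go
package main

import (
	"fmt"
	"net/url"
)

// p95Query returns the PromQL query for the 95th percentile check
// duration of the named policy.
func p95Query(policy string) string {
	return fmt.Sprintf(
		"histogram_quantile(0.95, sum(rate(apiserver_validating_admission_policy_check_duration_seconds_bucket{policy=%q}[5m])) by (le))",
		policy)
}

// rateQuery returns the PromQL query for the evaluation rate of the
// named policy.
func rateQuery(policy string) string {
	return fmt.Sprintf(
		"rate(apiserver_validating_admission_policy_check_total{policy=%q}[5m])",
		policy)
}

// queryURL builds an instant-query URL for the Prometheus HTTP API.
// It assumes base has no path prefix of its own.
func queryURL(base, promQL string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/query"
	u.RawQuery = url.Values{"query": {promQL}}.Encode()
	return u.String(), nil
}

func main() {
	const policy = "pod-security.policy.example.com"
	fmt.Println(p95Query(policy))
	fmt.Println(rateQuery(policy))

	// Hypothetical Prometheus address.
	u, err := queryURL("http://localhost:9090", rateQuery(policy))
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

Because these metrics are still alpha, their names and labels may change between releases, so treat generated queries as something to re-verify after an upgrade.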