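Kubernetes 1.23: Pod Security Graduates to Beta

Authors: Jim Angel (Google), Lachlan Evenson (Microsoft) | Thursday, December 9, 2021

With the release of Kubernetes v1.23, Pod Security admission has now entered beta. Pod Security is a built-in admission controller that evaluates Pod specifications against a predefined set of Pod Security Standards and determines whether to admit or deny the Pod from running.

Pod Security is the successor to PodSecurityPolicy, which was deprecated in the v1.21 release and will be removed in Kubernetes v1.25. In this article, we cover the key concepts of Pod Security along with how to use it. We hope that cluster administrators and developers alike will use this new mechanism to enforce secure defaults for their workloads.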

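Why Pod Security

The overall aim of Pod Security is to let you isolate workloads. You can run a cluster that hosts different workloads and, without adding extra third-party tooling, implement controls that require the Pods for a workload to restrict their own privileges to a defined bounding set.

Pod Security overcomes key shortcomings of Kubernetes' existing, but deprecated, PodSecurityPolicy (PSP) mechanism:

Policy authorization model: challenging to deploy with controllers.
Risks around switching: a lack of dry-run/audit capabilities made it hard to enable PodSecurityPolicy.
Inconsistent and unbounded API: the large configuration surface and evolving constraints led to a complex and confusing API.

The shortcomings of PSP made it very difficult to use, which led the community to reevaluate whether a better implementation could achieve the same goals. One of those goals was to provide an out-of-the-box solution for applying security best practices. Pod Security ships with predefined Pod Security levels that a cluster administrator can configure to meet the desired security posture.

It's important to note that Pod Security doesn't have complete feature parity with the deprecated PodSecurityPolicy. Specifically, it cannot mutate or change Kubernetes resources to auto-remediate a policy violation on behalf of the user. Additionally, it doesn't provide fine-grained control over each allowed field and value within a Pod specification or any other Kubernetes resource that you may wish to evaluate. If you need more fine-grained policy control, take a look at the other projects that support such use cases.

Pod Security also adheres to Kubernetes best practices of declarative object management by denying resources that violate the policy. This means that resources must be updated in source repositories, and tooling must be updated, before deploying to Kubernetes.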

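How Pod Security works

Pod Security is a built-in admission controller starting with Kubernetes v1.22, but it can also be run as a standalone webhook. Admission controllers intercept requests in the Kubernetes API server before they are persisted to storage, and can either admit or deny each request. In the case of Pod Security, Pod specifications are evaluated against a configured policy in the form of a Pod Security Standard. This means that security-sensitive fields in a Pod specification are only allowed to have specific values.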

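Configuring Pod Security

Pod Security Standards

To use Pod Security, we first need to understand the Pod Security Standards. These standards define three policy levels, ranging from permissive to restrictive:

privileged: open and unrestricted
baseline: covers known privilege escalations while minimizing restrictions
restricted: highly restricted, hardening against known and unknown privilege escalations; may cause compatibility issues

Each of these policy levels defines which fields are restricted within a Pod specification and which values are allowed. Some of the fields restricted by these policies include: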

spec.securityContext.sysctls
spec.hostNetwork
spec.volumes[*].hostPath
spec.containers[*].securityContext.privileged

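Policy levels are applied via labels on Namespace resources, which allows for granular per-namespace policy selection. The AdmissionConfiguration in the API server can also be configured to set cluster-wide default levels and exemptions.

Policy modes

Policies are applied in a specific mode. Multiple modes (with different policy levels) can be set on the same namespace. Here is the list of modes:

enforce: any Pod that violates the policy is rejected.
audit: violations are recorded as an annotation in the audit logs, but don't affect whether the Pod is allowed.
warn: violations send a warning message back to the user, but don't affect whether the Pod is allowed.

In addition to modes, you can also pin the policy to a specific version (for example, v1.22). Pinning to a specific version keeps the behavior consistent if the policy definition changes in future Kubernetes releases.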

Hands-on demo

Prerequisites

KinD
kubectl
Docker or Podman container runtime and CLI

Deploy a kind cluster

kind create cluster --image kindest/node:v1.23.0

It might take a while to start, and once it has started it might take a minute or so before the node becomes ready.

kubectl cluster-info --context kind-kind

Wait for the node STATUS to become Ready.

kubectl get nodes

The output is similar to this:

NAME                 STATUS   ROLES                  AGE   VERSION
kind-control-plane   Ready    control-plane,master   54m   v1.23.0

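Confirm Pod Security is enabled

The best way to confirm the API server's default enabled plugins is to check the help arguments of the Kubernetes API server container.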

kubectl -n kube-system exec kube-apiserver-kind-control-plane -it -- kube-apiserver -h | grep "default enabled ones"

The output is similar to this:

...
      --enable-admission-plugins strings
admission plugins that should be enabled in addition
to default enabled ones (NamespaceLifecycle, LimitRanger,
ServiceAccount, TaintNodesByCondition, PodSecurity, Priority,
DefaultTolerationSeconds, DefaultStorageClass,
StorageObjectInUseProtection, PersistentVolumeClaimResize,
RuntimeClass, CertificateApproval, CertificateSigning,
CertificateSubjectRestriction, DefaultIngressClass,
MutatingAdmissionWebhook, ValidatingAdmissionWebhook,
ResourceQuota).
...

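PodSecurity is listed in the group of default enabled admission plugins.

If you are using a cloud provider, or if you don't have access to the API server, the best way to check is to run a quick end-to-end test: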

kubectl create namespace verify-pod-security
kubectl label namespace verify-pod-security pod-security.kubernetes.io/enforce=restricted
# The following command does NOT create a workload (--dry-run=server)
kubectl -n verify-pod-security run test --dry-run=server --image=busybox --privileged
kubectl delete namespace verify-pod-security

The output is similar to this:

Error from server (Forbidden): pods "test" is forbidden: violates PodSecurity "restricted:latest": privileged (container "test" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "test" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "test" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "test" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "test" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

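Configure Pod Security

Policies are applied to a namespace via labels. These labels are as follows:

pod-security.kubernetes.io/<MODE>: <LEVEL> (required to enable Pod Security)
pod-security.kubernetes.io/<MODE>-version: <VERSION> (optional, defaults to latest)

A specific version can be supplied for each mode. The version pins the policy to the one shipped with that Kubernetes release. Pinning to a specific Kubernetes version gives deterministic policy behavior while leaving room for future updates to the Pod Security Standards. The possible values for <MODE> are enforce, audit and warn.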
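
As a rough illustration of the label scheme described here, the following shell sketch builds the label arguments for one mode and rejects unknown modes or levels. The helper name psa_label is hypothetical and not part of kubectl; only the label keys and the allowed mode and level values come from this article.

```shell
#!/bin/sh
# psa_label MODE LEVEL [VERSION]
# Prints the namespace label argument(s) for one Pod Security mode.
# psa_label is a hypothetical helper for illustration, not a kubectl feature.
psa_label() {
  mode="$1"; level="$2"; version="${3:-}"

  # Only the three documented modes are accepted.
  case "$mode" in
    enforce|audit|warn) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  # Only the three documented policy levels are accepted.
  case "$level" in
    privileged|baseline|restricted) ;;
    *) echo "unknown level: $level" >&2; return 1 ;;
  esac

  printf 'pod-security.kubernetes.io/%s=%s' "$mode" "$level"
  # The -version label is optional; without it the policy defaults to latest.
  if [ -n "$version" ]; then
    printf ' pod-security.kubernetes.io/%s-version=%s' "$mode" "$version"
  fi
  printf '\n'
}

# Example: print the labels that pin enforce to baseline at v1.23.
psa_label enforce baseline v1.23
```

The printed string can then be handed to kubectl, for example kubectl label --overwrite ns verify-pod-security $(psa_label enforce baseline v1.23), leaving the substitution unquoted so that each label becomes its own argument.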

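When to use `warn`?

The typical use for warn is to get ready for a future change where you want to enforce a different policy. The two most common cases are:

warn at the same level but a different version (for example, pin enforce to restricted+v1.23 and warn at restricted+latest)
warn at a stricter level (for example, enforce baseline and warn restricted)

It's not recommended to use warn at the exact same level+version as enforce. In the admission sequence, if enforce fails, the entire sequence fails before the warn policy is evaluated.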
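
The two warn scenarios can be sketched as label commands. This is a minimal sketch, not a prescribed workflow: the run wrapper and the two function names are hypothetical, and the commands are only printed unless APPLY=1 is set, so the block is safe to execute without a cluster.

```shell
#!/bin/sh
# Print each command by default; set APPLY=1 to run it against a cluster.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Case 1: same level, different version.
# enforce stays pinned to restricted at v1.23 while warn tracks restricted at latest.
warn_next_version() {
  run kubectl label --overwrite ns "$1" \
    pod-security.kubernetes.io/enforce=restricted \
    pod-security.kubernetes.io/enforce-version=v1.23 \
    pod-security.kubernetes.io/warn=restricted \
    pod-security.kubernetes.io/warn-version=latest
}

# Case 2: stricter level for warn.
# enforce uses baseline while warn previews what restricted would reject.
warn_stricter_level() {
  run kubectl label --overwrite ns "$1" \
    pod-security.kubernetes.io/enforce=baseline \
    pod-security.kubernetes.io/warn=restricted
}

# Show both commands for the demo namespace (printed only, since APPLY is unset).
warn_next_version verify-pod-security
warn_stricter_level verify-pod-security
```

The two cases are alternatives for the same namespace; in practice you would pick one rather than apply both in sequence.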

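First, create a namespace called verify-pod-security if it does not already exist. For the demo, --overwrite is used when labeling so that a single namespace can be reused for multiple examples.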

kubectl create namespace verify-pod-security

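Deploy demo workloads

Each workload represents a higher level of security that would not pass the profile that comes after it.

The following examples use the busybox container image running a sleep command for 1 million seconds (about 11.6 days) or until it is deleted. Pod Security is not interested in which container image you chose, but rather in the Pod-level settings and their implications for security.

Privileged level and workload

For the privileged Pod, use the privileged policy. This allows the process inside a container to gain new privileges (also known as "privilege escalation"), which can be risky if the workload is untrusted.

First, let's apply a restricted Pod Security level for a test.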

# enforces a "restricted" security policy and audits on restricted
kubectl label --overwrite ns verify-pod-security \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/audit=restricted

Next, try to deploy a privileged workload in the namespace.

cat <<EOF | kubectl -n verify-pod-security apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox-privileged
spec:
  containers:
  - name: busybox
    image: busybox
    args:
    - sleep
    - "1000000"
    securityContext:
      allowPrivilegeEscalation: true
EOF

The output is similar to this:

Error from server (Forbidden): error when creating "STDIN": pods "busybox-privileged" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "busybox" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "busybox" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "busybox" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "busybox" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

Now let's apply the privileged Pod Security level and try again.

# enforces a "privileged" security policy and warns / audits on baseline
kubectl label --overwrite ns verify-pod-security \
  pod-security.kubernetes.io/enforce=privileged \
  pod-security.kubernetes.io/warn=baseline \
  pod-security.kubernetes.io/audit=baseline

cat <<EOF | kubectl -n verify-pod-security apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox-privileged
spec:
  containers:
  - name: busybox
    image: busybox
    args:
    - sleep
    - "1000000"
    securityContext:
      allowPrivilegeEscalation: true
EOF

The output is similar to this:

pod/busybox-privileged created

We can run kubectl -n verify-pod-security get pods to verify it is running. Clean up with:

kubectl -n verify-pod-security delete pod busybox-privileged

Baseline level and workload

The baseline policy demonstrates sensible defaults while preventing common container exploits.

Let's revert to a restricted Pod Security level for a quick test.

# enforces a "restricted" security policy and audits on restricted
kubectl label --overwrite ns verify-pod-security \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/audit=restricted

Apply the workload.

cat <<EOF | kubectl -n verify-pod-security apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox-baseline
spec:
  containers:
  - name: busybox
    image: busybox
    args:
    - sleep
    - "1000000"
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
          - NET_BIND_SERVICE
          - CHOWN
EOF

The output is similar to this:

Error from server (Forbidden): error when creating "STDIN": pods "busybox-baseline" is forbidden: violates PodSecurity "restricted:latest": unrestricted capabilities (container "busybox" must set securityContext.capabilities.drop=["ALL"]; container "busybox" must not include "CHOWN" in securityContext.capabilities.add), runAsNonRoot != true (pod or container "busybox" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "busybox" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

Let's apply the baseline Pod Security level and try again.

# enforces a "baseline" security policy and warns / audits on restricted
kubectl label --overwrite ns verify-pod-security \
  pod-security.kubernetes.io/enforce=baseline \
  pod-security.kubernetes.io/warn=restricted \
  pod-security.kubernetes.io/audit=restricted

cat <<EOF | kubectl -n verify-pod-security apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox-baseline
spec:
  containers:
  - name: busybox
    image: busybox
    args:
    - sleep
    - "1000000"
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
          - NET_BIND_SERVICE
          - CHOWN
EOF

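The output is similar to the following. Note that the warnings match the error message from the test above, but the Pod is still successfully created.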

Warning: would violate PodSecurity "restricted:latest": unrestricted capabilities (container "busybox" must set securityContext.capabilities.drop=["ALL"]; container "busybox" must not include "CHOWN" in securityContext.capabilities.add), runAsNonRoot != true (pod or container "busybox" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "busybox" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
pod/busybox-baseline created

Remember that we set the verify-pod-security namespace to warn based on the restricted profile. We can run kubectl -n verify-pod-security get pods to verify it is running. Clean up with:

kubectl -n verify-pod-security delete pod busybox-baseline

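Restricted level and workload

The restricted policy requires rejection of all privileged parameters. It is the most secure, at the cost of added complexity. The restricted policy allows containers to add only the NET_BIND_SERVICE capability.

While we've already tested restricted as a blocking function, let's try to get something running that meets all the criteria.

First, we need to reapply the restricted profile one last time.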

# enforces a "restricted" security policy and audits on restricted
kubectl label --overwrite ns verify-pod-security \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/audit=restricted

cat <<EOF | kubectl -n verify-pod-security apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox-restricted
spec:
  containers:
  - name: busybox
    image: busybox
    args:
    - sleep
    - "1000000"
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
          - NET_BIND_SERVICE
EOF

The output is similar to this:

Error from server (Forbidden): error when creating "STDIN": pods "busybox-restricted" is forbidden: violates PodSecurity "restricted:latest": unrestricted capabilities (container "busybox" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "busybox" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "busybox" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

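This is because the restricted profile explicitly requires that certain values are set to the most secure parameters.

By requiring explicit values, manifests become more declarative and your entire security model can shift left. With the restricted level of enforcement, a company could audit their cluster's compliance based on permitted manifests.

Let's fix each reported violation, resulting in the following file: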

cat <<EOF | kubectl -n verify-pod-security apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox-restricted
spec:
  containers:
  - name: busybox
    image: busybox
    args:
    - sleep
    - "1000000"
    securityContext:
      seccompProfile:
        type: RuntimeDefault
      runAsNonRoot: true
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
        add:
          - NET_BIND_SERVICE
EOF

The output is similar to this:

pod/busybox-restricted created

Run kubectl -n verify-pod-security get pods to verify it is running. The output is similar to this:

NAME                 READY   STATUS                       RESTARTS   AGE
busybox-restricted   0/1     CreateContainerConfigError   0          2m26s

Let's find out why the container is not starting by running kubectl -n verify-pod-security describe pod busybox-restricted. The output is similar to this:

Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Warning  Failed     2m29s (x8 over 3m55s)  kubelet            Error: container has runAsNonRoot and image will run as root (pod: "busybox-restricted_verify-pod-security(a4c6a62d-2166-41a9-b288-20df17cf5c90)", container: busybox)

To solve this, set the effective UID (runAsUser) to a non-zero (that is, non-root) value, for example the nobody UID (65534).

# delete the original pod
kubectl -n verify-pod-security delete pod busybox-restricted

# create the pod again with new runAsUser
cat <<EOF | kubectl -n verify-pod-security apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox-restricted
spec:
  securityContext:
    runAsUser: 65534
  containers:
  - name: busybox
    image: busybox
    args:
    - sleep
    - "1000000"
    securityContext:
      seccompProfile:
        type: RuntimeDefault
      runAsNonRoot: true
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
        add:
          - NET_BIND_SERVICE
EOF

Run kubectl -n verify-pod-security get pods to verify it is running. The output is similar to this:

NAME                 READY   STATUS    RESTARTS   AGE
busybox-restricted   1/1     Running   0          25s

Clean up the demo (the restricted Pod and the namespace) with:

kubectl delete namespace verify-pod-security

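At this point, if you want to dive deeper into Linux permissions or what is permitted for a certain container, exec into the control plane and play around with containerd and crictl inspect.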

# if using docker, shell into the control plane
docker exec -it kind-control-plane bash

# list running containers
crictl ps

# inspect each one by container ID
crictl inspect <CONTAINER ID>

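Applying a cluster-wide policy

In addition to applying labels to namespaces to configure policy, you can also configure cluster-wide policies and exemptions using the AdmissionConfiguration resource.

Using this resource, policy definitions are applied cluster-wide by default, and any policy that is applied via namespace labels takes precedence.

There is no runtime-configurable API for the AdmissionConfiguration file, which means that a cluster administrator needs to specify a path to the file below via the --admission-control-config-file flag on the API server.

In the following resource, we enforce and audit the baseline policy and warn on the restricted policy. We also exempt the kube-system namespace from this policy.

It's not recommended to alter the control plane or cluster after install, so let's build a new cluster with a default policy applied to all namespaces.

First, delete the current cluster.

kind delete cluster

Create a Pod Security configuration that enforces and audits the baseline policy while using the restricted profile to warn the end user.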

cat <<EOF > pod-security.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "baseline"
      enforce-version: "latest"
      audit: "baseline"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      # Array of authenticated usernames to exempt.
      usernames: []
      # Array of runtime class names to exempt.
      runtimeClasses: []
      # Array of namespaces to exempt.
      namespaces: [kube-system]
EOF

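For additional options, check out the official standards admission controller docs.

We now have a default baseline policy. Next, pass it to the kind configuration to enable the --admission-control-config-file API server argument and supply the policy file. To pass a file to a kind cluster, use a configuration file to provide additional setup instructions. Kind uses kubeadm to configure the cluster, and the configuration file can pass kubeadmConfigPatches for further customization. In our case, the local file is mounted into the control plane node as /etc/kubernetes/policies/pod-security.yaml, which is then mounted into the apiServer container. We also pass the --admission-control-config-file argument pointing to the policy's location.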

cat <<EOF > kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: ClusterConfiguration
    apiServer:
        # enable admission-control-config flag on the API server
        extraArgs:
          admission-control-config-file: /etc/kubernetes/policies/pod-security.yaml
        # mount new file / directories on the control plane
        extraVolumes:
          - name: policies
            hostPath: /etc/kubernetes/policies
            mountPath: /etc/kubernetes/policies
            readOnly: true
            pathType: "DirectoryOrCreate"
  # mount the local file on the control plane
  extraMounts:
  - hostPath: ./pod-security.yaml
    containerPath: /etc/kubernetes/policies/pod-security.yaml
    readOnly: true
EOF

Create a new cluster using the kind configuration file defined above.

kind create cluster --image kindest/node:v1.23.0 --config kind-config.yaml

Let's look at the default namespace.

kubectl describe namespace default

The output is similar to this:

Name:         default
Labels:       kubernetes.io/metadata.name=default
Annotations:  <none>
Status:       Active

No resource quota.

No LimitRange resource.

Let's create a new namespace and see if the labels apply there.

kubectl create namespace test-defaults
kubectl describe namespace test-defaults

Same.

Name:         test-defaults
Labels:       kubernetes.io/metadata.name=test-defaults
Annotations:  <none>
Status:       Active

No resource quota.

No LimitRange resource.

Can a privileged workload be deployed?

cat <<EOF | kubectl -n test-defaults apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox-privileged
spec:
  containers:
  - name: busybox
    image: busybox
    args:
    - sleep
    - "1000000"
    securityContext:
      allowPrivilegeEscalation: true
EOF

Hmm... yep. At least the default warn level is working.

Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "busybox" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "busybox" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "busybox" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "busybox" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
pod/busybox-privileged created

Let's delete the pod with kubectl -n test-defaults delete pod/busybox-privileged.

Is my config even working?

# if using docker, shell into the control plane
docker exec -it kind-control-plane bash

# cat out the file we mounted
cat /etc/kubernetes/policies/pod-security.yaml

# check the api server logs
cat /var/log/containers/kube-apiserver*.log 

# check the api server config
cat /etc/kubernetes/manifests/kube-apiserver.yaml
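
To check the last of these files without reading it by eye, a small filter can pull out the admission configuration path. This is a sketch under the assumption that the static Pod manifest lists each API server flag on its own line in the form "- --flag=value"; the function name admission_config_path is hypothetical.

```shell
#!/bin/sh
# Read a kube-apiserver static Pod manifest on stdin and print the value of
# --admission-control-config-file, if the flag is present.
# Assumes each flag appears on its own line as "- --flag=value".
admission_config_path() {
  sed -n 's/^[[:space:]]*-[[:space:]]*--admission-control-config-file=//p'
}

# Example with a trimmed-down manifest fragment.
admission_config_path <<'EOF'
spec:
  containers:
  - command:
    - kube-apiserver
    - --admission-control-config-file=/etc/kubernetes/policies/pod-security.yaml
EOF
```

Against the kind cluster from this demo, the same filter could be fed with docker exec kind-control-plane cat /etc/kubernetes/manifests/kube-apiserver.yaml | admission_config_path. An empty result would suggest the flag was not applied.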

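UPDATE: The baseline policy permits allowPrivilegeEscalation. Although the default Pod Security enforcement levels are not visible on the namespace, they are in effect. Let's try to provide a manifest that violates the baseline by requesting hostNetwork access.

# delete the original pod if it still exists
kubectl -n test-defaults delete pod busybox-privileged --ignore-not-found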

cat <<EOF | kubectl -n test-defaults apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox-privileged
spec:
  containers:
  - name: busybox
    image: busybox
    args:
    - sleep
    - "1000000"
  hostNetwork: true
EOF

The output is similar to this:

Error from server (Forbidden): error when creating "STDIN": pods "busybox-privileged" is forbidden: violates PodSecurity "baseline:latest": host namespaces (hostNetwork=true)

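Yes!!! It worked! 🎉🎉🎉

I later found out that another way to check if things are operating as intended is to query the raw API server metrics endpoint.

Run the following command:

kubectl get --raw /metrics | grep pod_security_evaluations_total

The output is similar to this: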

# HELP pod_security_evaluations_total [ALPHA] Number of policy evaluations that occurred, not counting ignored or exempt requests.
# TYPE pod_security_evaluations_total counter
pod_security_evaluations_total{decision="allow",mode="enforce",policy_level="baseline",policy_version="latest",request_operation="create",resource="pod",subresource=""} 2
pod_security_evaluations_total{decision="allow",mode="enforce",policy_level="privileged",policy_version="latest",request_operation="create",resource="pod",subresource=""} 0
pod_security_evaluations_total{decision="allow",mode="enforce",policy_level="privileged",policy_version="latest",request_operation="update",resource="pod",subresource=""} 0
pod_security_evaluations_total{decision="deny",mode="audit",policy_level="baseline",policy_version="latest",request_operation="create",resource="pod",subresource=""} 1
pod_security_evaluations_total{decision="deny",mode="enforce",policy_level="baseline",policy_version="latest",request_operation="create",resource="pod",subresource=""} 1
pod_security_evaluations_total{decision="deny",mode="warn",policy_level="restricted",policy_version="latest",request_operation="create",resource="controller",subresource=""} 2
pod_security_evaluations_total{decision="deny",mode="warn",policy_level="restricted",policy_version="latest",request_operation="create",resource="pod",subresource=""} 2

Monitoring tools can ingest these metrics too for reporting, assessments, or measuring trends.
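
As one way to read these counters at a glance, the following sketch sums the deny decisions per mode. It assumes the metric lines look like the sample output in this article, with the count as the last whitespace-separated field; the function name psa_denials_by_mode is hypothetical.

```shell
#!/bin/sh
# Read Prometheus-style metrics on stdin and print "<mode> <count>" for every
# mode that recorded deny decisions in pod_security_evaluations_total.
psa_denials_by_mode() {
  awk '
    /^pod_security_evaluations_total[{]/ && /decision="deny"/ {
      mode = "unknown"
      if (match($0, /mode="[a-z]+"/)) {
        # Skip the 6 characters of mode=" and drop the closing quote.
        mode = substr($0, RSTART + 6, RLENGTH - 7)
      }
      # The sample value is the last field on the line.
      total[mode] += $NF
    }
    END {
      for (m in total) printf "%s %d\n", m, total[m]
    }
  ' | sort
}

# Example with two lines copied from the sample output.
psa_denials_by_mode <<'EOF'
pod_security_evaluations_total{decision="deny",mode="enforce",policy_level="baseline",policy_version="latest",request_operation="create",resource="pod",subresource=""} 1
pod_security_evaluations_total{decision="deny",mode="warn",policy_level="restricted",policy_version="latest",request_operation="create",resource="pod",subresource=""} 2
EOF
```

On a live cluster the function can be fed directly from the metrics endpoint: kubectl get --raw /metrics | psa_denials_by_mode.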

Cleanup

When finished, delete the kind cluster.

kind delete cluster

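Auditing

Auditing is another way to track which policies are being enforced in your cluster. To set up auditing with kind, review the official docs for enabling auditing. As of version 1.11, Kubernetes audit logs include two annotations that indicate whether or not a request was authorized (authorization.k8s.io/decision) and the reason for the decision (authorization.k8s.io/reason). Audit events can be streamed to a webhook for monitoring, tracking, or alerting.

The annotations on an audit event look similar to the following:

{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"","pod-security.kubernetes.io/audit":"allowPrivilegeEscalation != false (container \"busybox\" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container \"busybox\" must set securityContext.capabilities.drop=[\"ALL\"]), runAsNonRoot != true (pod or container \"busybox\" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container \"busybox\" must set securityContext.seccompProfile.type to \"RuntimeDefault\" or \"Localhost\")"}

Auditing is also a good first step in evaluating your cluster's current compliance with Pod Security. The Kubernetes Enhancement Proposal (KEP) hints at a future where baseline could be the default for unlabeled namespaces.

Example audit-policy.yaml configuration tuned for Pod Security events: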

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: RequestResponse
  resources:
    - group: "" # core API group
      resources: ["pods", "pods/ephemeralcontainers", "podtemplates", "replicationcontrollers"]
    - group: "apps"
      resources: ["daemonsets", "deployments", "replicasets", "statefulsets"]
    - group: "batch"
      resources: ["cronjobs", "jobs"]
  verbs: ["create", "update"]
  omitStages:
    - "RequestReceived"
    - "ResponseStarted"
    - "Panic"

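Once auditing is enabled, look at the configured local file if using --audit-log-path, or at the destination of the webhook if using --audit-webhook-config-file.

If using a file (--audit-log-path), run cat /PATH/TO/API/AUDIT.log | grep "is forbidden:" to see all rejected workloads that were audited.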
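
Rejections are only part of the picture, because audit mode records violations for Pods that were still admitted. A minimal sketch for counting those events follows; it assumes one JSON audit event per line and matches on the annotation key prefix seen in the sample event in this article, which may differ between Kubernetes versions. The function name psa_audit_count is hypothetical.

```shell
#!/bin/sh
# Read audit log lines on stdin and print how many of them carry a
# Pod Security audit annotation. Matches on the key prefix only.
psa_audit_count() {
  # grep -c exits non-zero when nothing matches; keep the function's status at 0.
  grep -c -F '"pod-security.kubernetes.io/audit' || true
}

# Example with two made-up, heavily shortened events.
printf '%s\n' \
  '{"annotations":{"pod-security.kubernetes.io/audit":"runAsNonRoot != true"}}' \
  '{"annotations":{"authorization.k8s.io/decision":"allow"}}' |
  psa_audit_count
```

With a real log file the call would be psa_audit_count < /PATH/TO/API/AUDIT.log.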

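PSP migrations

If you're already using PSP, SIG Auth has created a guide and published the steps to migrate off of PSP.

To summarize the process:

Update all existing PSPs to be non-mutating
Apply Pod Security policies in warn or audit mode
Upgrade Pod Security policies to enforce mode
Remove PodSecurityPolicy from --enable-admission-plugins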

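SIG Auth has floated the idea of providing a tool to assist with migrations; it is listed as an "optional future extension" and is currently out of scope. More details can be found in the KEP.

Wrap up

Pod Security is a promising new feature that provides an out-of-the-box way to allow users to improve the security posture of their workloads. Like any new enhancement that has matured to beta, we ask that you try it out, provide feedback, or share your experience by raising a GitHub issue or joining the SIG Auth community meetings. It's our hope that Pod Security will be deployed on every cluster as part of our ongoing pursuit, as a community, to make Kubernetes security a priority.

For a step-by-step guide on how to enable the "baseline" Pod Security Standard with the Pod Security Admission feature, please refer to the dedicated tutorials that cover the configuration needed at the cluster level and the namespace level.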
