Kubernetes Pod Security Policies allow you to control the security specifications that pods must adhere to in order to run in your cluster. You can block users from deploying inherently insecure pods, whether they do so intentionally or unintentionally. This is a powerful feature and a security best practice, and it can be a big step toward keeping your cluster free of insecure resources.
However, some pods may require additional security permissions beyond what most cluster users are allowed to deploy. For example, monitoring or metrics tooling may need host network access or may need to run in privileged mode. Also, you may need to allow developers to run applications with additional capabilities during early development stages just to make progress.
How hard is it to use Pod Security Policies to judiciously secure your cluster? We’ll look at that in this blog post.
For my environment, I’m using a Kubernetes version 1.15.11 cluster deployed onto Ubuntu VMs using kubeadm. My cluster has one master node and several worker nodes.
While version 1.15 is not the latest Kubernetes version available, it is still the latest version offered in some managed cloud environments. For instance, it is the newest version available in EKS (Amazon's Elastic Kubernetes Service) at the time of writing this blog.
The following diagram shows the namespaces in my cluster. I will use this diagram throughout this blog post to show how Pod Security Policies are used in the cluster:
The namespaces I’ll be using for my work in this blog are the kube-system, default, and development namespaces.
These are the user accounts I’ll be referencing, along with their RBAC roles:
- kubernetes-admin
  - groups: system:masters
  - ClusterRoleBinding: cluster-admin ClusterRole cluster-wide via the system:masters group
- user-1
  - RoleBinding: edit ClusterRole in the default namespace
- user-2
  - groups: dev
  - RoleBinding: edit ClusterRole in the development namespace via the dev group
- dev-admin
  - RoleBinding: admin ClusterRole in the development namespace
Enabling Pod Security Policies
I can create PodSecurityPolicy resources whether or not Pod Security Policies are enabled. However, those resources will have no effect unless I enable Pod Security Policies. Pod Security Policies are controlled by an optional admission controller. Because of this, the first thing I need to do is see if the PodSecurityPolicy admission controller is enabled in my cluster. Since admission controllers are enabled in the configuration of the Kubernetes API Server, I will use the following steps:
- Identify a kube-apiserver pod in my cluster.
- Check that pod to see if the PodSecurityPolicy admission controller is enabled by default for my Kubernetes version.
- If it is not enabled by default, check to see if it has been added to the list of enabled admission controllers.
If the PodSecurityPolicy admission controller is not currently enabled, then I will need to edit the manifest file for the kube-apiserver static pod to enable it.
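On a kubeadm cluster, steps 2 and 3 can also be shortcut by inspecting the static pod manifest directly on the master node. A minimal sketch, assuming the standard kubeadm manifest path (the helper function name here is my own):

```shell
# Print the enabled (non-default) admission plugins from a kube-apiserver
# static pod manifest. Pass the manifest path as the first argument.
enabled_admission_plugins() {
  grep -o 'enable-admission-plugins=[^[:space:]]*' "$1" | cut -d= -f2
}

# On the master node (default kubeadm manifest location):
# enabled_admission_plugins /etc/kubernetes/manifests/kube-apiserver.yaml
```

If the output does not mention PodSecurityPolicy, the admission controller is not enabled.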
As a side note, if you are using Amazon EKS running Kubernetes version 1.13 or later, then Pod Security Policies are already enabled. If you are running an earlier version of Kubernetes under EKS, then you will need to upgrade to use Pod Security Policies.
Check if the PodSecurityPolicy admission controller is enabled
I’ll do the initial setup work as a user with the cluster-admin role. Later, I’ll switch to a different user to test how the Pod Security Policies work in my cluster:
kubectl config use-context kubernetes-admin@kubernetes
First, find a kube-apiserver pod:
kubectl -n kube-system get pod | grep kube-apiserver
kube-apiserver-k3master   1/1   Running   4   12d
Check the default admission controllers
Next, exec into the pod and run the command kube-apiserver -h | grep enable-admission-plugins. Since there is a lot of output, I have omitted some of the response for brevity:
kubectl -n kube-system exec -it kube-apiserver-k3master sh
# kube-apiserver -h | grep enable-admission-plugins
      --admission-control strings   ... REST_OF_LINE_OMITTED ...
      --enable-admission-plugins strings   admission plugins that should be enabled in addition to default enabled ones (NamespaceLifecycle, LimitRanger, ServiceAccount, TaintNodesByCondition, Priority, DefaultTolerationSeconds, DefaultStorageClass, StorageObjectInUseProtection, PersistentVolumeClaimResize, MutatingAdmissionWebhook, ValidatingAdmissionWebhook, ResourceQuota). Comma-delimited list of admission plugins: AlwaysAdmit, AlwaysDeny, AlwaysPullImages, DefaultStorageClass, DefaultTolerationSeconds, DenyEscalatingExec, DenyExecOnPrivileged, EventRateLimit, ExtendedResourceToleration, ImagePolicyWebhook, LimitPodHardAntiAffinityTopology, LimitRanger, MutatingAdmissionWebhook, NamespaceAutoProvision, NamespaceExists, NamespaceLifecycle, NodeRestriction, OwnerReferencesPermissionEnforcement, PersistentVolumeClaimResize, PersistentVolumeLabel, PodNodeSelector, PodPreset, PodSecurityPolicy, PodTolerationRestriction, Priority, ResourceQuota, SecurityContextDeny, ServiceAccount, StorageObjectInUseProtection, TaintNodesByCondition, ValidatingAdmissionWebhook. The order of plugins in this flag does not matter.
I’m interested in the line in the response that starts with --enable-admission-plugins strings. I can see that the default admission controllers for this Kubernetes version are: NamespaceLifecycle, LimitRanger, ServiceAccount, TaintNodesByCondition, Priority, DefaultTolerationSeconds, DefaultStorageClass, StorageObjectInUseProtection, PersistentVolumeClaimResize, MutatingAdmissionWebhook, ValidatingAdmissionWebhook, and ResourceQuota.
My Kubernetes version does not enable Pod Security Policies by default. That means I’ll need to check the commands for the container in the kube-apiserver pod to see if the PodSecurityPolicy admission controller has been added to the list of enabled admission plugins.
Check the enabled non-default admission controllers
I’ll get the kube-apiserver pod details in YAML format to see what arguments were passed to the kube-apiserver command. I have left out the non-applicable content from the response:
kubectl -n kube-system get pod kube-apiserver-k3master -o yaml
...
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=172.26.0.20
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --insecure-port=0
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-cluster-ip-range=10.96.0.0/12
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    image: k8s.gcr.io/kube-apiserver:v1.15.11
    name: kube-apiserver
...
Optional admission controllers are enabled using the --enable-admission-plugins command argument. I can see that only the NodeRestriction admission controller is enabled (in addition to the default admission controllers). Because of this, I’ll need to edit the manifest file for the pod in the /etc/kubernetes/manifests directory on the master node. But before I enable the PodSecurityPolicy admission controller, I should create and authorize some PodSecurityPolicy resources so that cluster users and service accounts won’t be blocked from starting pods.
It is important to understand that in addition to enabling Pod Security Policies and creating PodSecurityPolicy resources, I must authorize users, groups, and service accounts to use those PodSecurityPolicy resources. I do this with ClusterRoles, RoleBindings, and ClusterRoleBindings.
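For example, authorizing a single service account (rather than a whole group) to use a policy only takes a different subject in the binding. The namespace, binding name, and service account name in this sketch are hypothetical; privileged-psp is the ClusterRole that grants the use verb on a policy:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-agent-psp        # hypothetical binding name
  namespace: monitoring          # hypothetical namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: privileged-psp           # ClusterRole granting 'use' on a policy
subjects:
- kind: ServiceAccount
  name: metrics-agent            # hypothetical service account
  namespace: monitoring
```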
Since my cluster is a sandbox cluster, I tested enabling Pod Security Policies before creating and authorizing any PodSecurityPolicy resources. After editing the kube-apiserver.yaml file, I was able to connect to the API after a short period. However, even though I could connect, the kube-apiserver pod was not listed in the output of kubectl -n kube-system get pods. Also, the output of journalctl -u kubelet contained error lines with content such as pods "kube-apiserver-k3master" is forbidden: unable to validate against any pod security policy: . This indicates that enabling Pod Security Policies before creating and authorizing PodSecurityPolicy resources can cause significant problems.
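If you do get into this state, the quickest recovery is to revert the manifest change so the kubelet restarts the API server without the PodSecurityPolicy admission controller. A sketch of the edit, assuming the default kubeadm manifest path (the helper function name is my own):

```shell
# Remove PodSecurityPolicy from the enabled admission plugins in a
# kube-apiserver static pod manifest. Pass the manifest path as $1.
disable_psp() {
  sed -i 's/,PodSecurityPolicy//' "$1"
}

# On the master node:
# disable_psp /etc/kubernetes/manifests/kube-apiserver.yaml
```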
Create PodSecurityPolicy resources before enabling Pod Security Policies
To start with, I’ll create a very permissive PodSecurityPolicy resource named privileged that will not block any pods from starting. Next, I’ll use a ClusterRole and ClusterRoleBinding to authorize all authenticated users to use that permissive policy cluster-wide. That way the API Server will restart OK, and the cluster will initially behave the same as it did without Pod Security Policies enabled. After that, I can experiment with more restrictive PodSecurityPolicy resources. For now, my cluster will look like this:
I show the manifest for the privileged policy below, along with a ClusterRole and ClusterRoleBinding to allow all authenticated users to use that policy. This content is based on examples from the Kubernetes documentation.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: privileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
spec:
  privileged: true
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - '*'
  volumes:
  - '*'
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  hostIPC: true
  hostPID: true
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: privileged-psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames:
  - privileged
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: privileged-psp-system-authenticated
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: privileged-psp
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:authenticated
Create the resources:
kubectl create -f privileged-psp.yaml
kubectl create -f privileged-psp-clusterrole.yaml
kubectl create -f privileged-psp-clusterrolebinding.yaml
Enable the PodSecurityPolicy admission controller
Enable the PodSecurityPolicy admission controller using the following steps:
- SSH into the master node.
- Edit the /etc/kubernetes/manifests/kube-apiserver.yaml file.
- Change the --enable-admission-plugins=NodeRestriction line to --enable-admission-plugins=NodeRestriction,PodSecurityPolicy.
- Save the file.
- The kube-apiserver pod will automatically restart.
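The manifest edit in the steps above can also be scripted. A minimal sketch, assuming the plugin list currently reads NodeRestriction as shown earlier (the helper function name is my own):

```shell
# Append PodSecurityPolicy to the enabled admission plugins in a
# kube-apiserver static pod manifest. Pass the manifest path as $1.
enable_psp() {
  sed -i 's/--enable-admission-plugins=NodeRestriction/--enable-admission-plugins=NodeRestriction,PodSecurityPolicy/' "$1"
}

# On the master node:
# enable_psp /etc/kubernetes/manifests/kube-apiserver.yaml
```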
Once you can connect to the cluster with kubectl again, check that the kube-apiserver pod restarted and is running without errors.
Try a more restrictive PodSecurityPolicy
Using a very permissive PodSecurityPolicy resource for everything in the entire cluster does not help my security posture, so I’ll create a more restrictive PodSecurityPolicy resource. I’ll use a ClusterRole and RoleBindings to allow all authenticated users in specific namespaces to use the new policy. Next, I’ll create a new RoleBinding to authorize the existing privileged PodSecurityPolicy resource for specific namespaces like kube-system. Finally, I’ll delete the ClusterRoleBinding that I created earlier for the permissive policy so the privileged policy is no longer authorized cluster-wide. This approach accomplishes several goals:
- Allows monitoring and metrics tooling running in the kube-system namespace to start successfully even if it needs additional security permissions.
- Limits the use of the privileged policy to specific namespaces.
- Demonstrates how cluster admins can set (authorize, technically speaking) restrictive policies for specific users, groups, or service accounts in specific namespaces.
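As a concrete sketch of the last goal, a RoleBinding scoped to just the dev group in the development namespace (rather than all authenticated users) would look something like this; the binding name is hypothetical, and no-privileged-pods-psp is the ClusterRole created below:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: no-privileged-pods-dev    # hypothetical binding name
  namespace: development
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: no-privileged-pods-psp
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: dev
```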
Create the PodSecurityPolicy, ClusterRole, and RoleBinding resources
I’ll create a restrictive PodSecurityPolicy resource based on an example from the Kubernetes docs that does not allow privileged pods to be created. I’ll also use a ClusterRole and RoleBinding to authorize all authenticated users to use the policy in the default namespace.
Until I make more changes, my cluster will look like the following diagram. The new policy does not yet restrict starting any pods since all users are still authorized to use the privileged policy:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: no-privilegd-pods-psp
spec:
  privileged: false  # Don't allow privileged pods!
  # The rest fills in some required fields.
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - '*'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: no-privileged-pods-psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames:
  - no-privilegd-pods-psp
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: no-privileged-pods-authenticated
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: no-privileged-pods-psp
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:authenticated
Create the new resources
kubectl create -f no-privileged-pods-psp.yaml
kubectl create -f no-privileged-pods-clusterrole.yaml
kubectl -n default create -f no-privileged-pods-rolebinding.yaml
Restrict the privileged policy to a single namespace
I’ll create a new RoleBinding that I will use to authorize the privileged policy for all authenticated users in the kube-system namespace. Then I will delete the ClusterRoleBinding for that policy. Then my cluster will look like this, where the privileged policy only controls pods in the kube-system namespace:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: privileged-psp-system-authenticated
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: privileged-psp
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:authenticated
Change the RoleBindings
Apply the new RoleBinding, then delete the old ClusterRoleBinding for the privileged PodSecurityPolicy resource:
kubectl -n kube-system create -f privileged-role-binding.yaml
kubectl delete clusterrolebinding privileged-psp-system-authenticated
Test the restrictive policy
At this point, I’m done with the setup work and I’m ready to see how Pod Security Policies restrict what pods can run in the cluster. However, a user with the cluster-admin ClusterRole will be able to start pods regardless of the Pod Security Policies in place. Because of this, I’ll switch to a user with only edit permissions in the default namespace:
kubectl config use-context user-1@kubernetes
I should be able to start a pod in the default namespace that does not specify the use of privileged mode. Even though it is not required, I’ll specify the default namespace in my kubectl commands for clarity:
kubectl -n default run test-nginx --image nginx
The pod (created by a deployment in this case) starts successfully:
kubectl -n default get pods
NAME                          READY   STATUS    RESTARTS   AGE
test-nginx-5bd4f6f56f-p7768   1/1     Running   0          4m21s
Next, let’s try to start a similar pod that uses privileged mode. For this test, I’ll use the following manifest file (privileged-nginx.yaml):
apiVersion: v1
kind: Pod
metadata:
  name: privileged-nginx
spec:
  containers:
  - name: privileged-nginx
    image: nginx
    securityContext:
      privileged: true
kubectl -n default apply -f .\privileged-nginx.yaml
Error from server (Forbidden): error when creating ".\privileged-nginx.yaml": pods "privileged-nginx" is forbidden: unable to validate against any pod security policy: [spec.containers.securityContext.privileged: Invalid value: true: Privileged containers are not allowed]
This demonstrates that the restrictive policy blocks users without the cluster-admin ClusterRole from starting pods that use privileged mode.
Be careful with the cluster-admin ClusterRole
To emphasize the need to maintain tight control over the use of the cluster-admin ClusterRole, let’s switch the context to a user with the cluster-admin ClusterRole and try again:
kubectl config use-context kubernetes-admin@kubernetes
kubectl -n default apply -f .\privileged-nginx.yaml
kubectl -n default get po
NAME                          READY   STATUS    RESTARTS   AGE
privileged-nginx              1/1     Running   0          1m21s
test-nginx-5bd4f6f56f-hnnpq   1/1     Running   0          60m
Based on this testing, you can see that a user with the cluster-admin ClusterRole is able to start a pod despite the Pod Security Policy restrictions. Therefore, be careful who you grant this ClusterRole to!
I’ll clean up the resources in the default namespace before moving on:
kubectl -n default delete pod,deployment --all
A practical approach to applying Pod Security Policies
The current state of my Pod Security Policies
I started with a very permissive policy (the privileged policy) to avoid any issues when first enabling Pod Security Policies. Next, I created a more restrictive policy for all users in one namespace and limited the use of the permissive policy to the kube-system namespace. This was fine for experimenting to get a feel for how Pod Security Policies work, but it is not very practical for normal use. Currently, only two namespaces have PodSecurityPolicy resources authorized. This means that only cluster-admin users can start pods in any other namespace. Let’s verify that. I’ll use the following resources:
- The development namespace that does not have any PodSecurityPolicy resources authorized yet.
- The user-2 user account with a RoleBinding for the edit ClusterRole in the development namespace.
- The dev-admin user account with a RoleBinding for the admin ClusterRole in the development namespace.
First, I’ll try to start a pod in the development namespace using the user-2 user. Next, I’ll try to start the pod as the dev-admin user:
kubectl config use-context user-2@kubernetes
Switched to context "user-2@kubernetes".
kubectl -n development run nginx-test --restart Never --image nginx
Error from server (Forbidden): pods "nginx-test" is forbidden: unable to validate against any pod security policy:
kubectl config use-context dev-admin@kubernetes
Switched to context "dev-admin@kubernetes".
kubectl -n development run nginx-test --restart Never --image nginx
Error from server (Forbidden): pods "nginx-test" is forbidden: unable to validate against any pod security policy:
So, with the current configuration, I need to authorize a PodSecurityPolicy resource for every namespace for users to be able to start any pods. As a result, I’ll have additional maintenance overhead as I add namespaces for new projects.
Create a secure and functional “base” Pod Security Policy
To reduce the cluster maintenance overhead required just to allow users to do basic work, I can use the following approach:
- First, I can create a very restrictive PodSecurityPolicy resource that I consider safe for any namespace and any user.
- Next, I can authorize that policy for all users on the entire cluster with a ClusterRole and a ClusterRoleBinding.
- After that, I can create less restrictive PodSecurityPolicy resources and authorize them as needed and justified for specific users, groups, and service accounts in specific namespaces.
Configure the “base” policy
First I’ll create a very restrictive PodSecurityPolicy resource named base-psp, again based on an example from the Kubernetes docs. This policy requires pods to run as an unprivileged user and blocks possible escalations to root, among other security considerations. I’ll also create a ClusterRole and ClusterRoleBinding to allow all authenticated cluster users to use the policy.
Now my cluster will look like this. Assuming they have the required RBAC permissions, users can start “safe” pods in any namespace. Pods in the kube-system and default namespaces will be allowed to start with less restrictive security specs than pods in the other namespaces:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: base-psp
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default,runtime/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'runtime/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  # This is redundant with non-root + disallow privilege escalation,
  # but we can provide it for defense in depth.
  requiredDropCapabilities:
  - ALL
  # Allow core volume types.
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  # Assume that persistentVolumes set up by the cluster admin are safe to use.
  - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  seLinux:
    # This policy assumes the nodes are using AppArmor rather than SELinux.
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
    # Forbid adding the root group.
    - min: 1
      max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
    # Forbid adding the root group.
    - min: 1
      max: 65535
  readOnlyRootFilesystem: false
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: base-psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames:
  - base-psp
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: base-psp-system-authenticated
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: base-psp
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: system:authenticated
Create the resources
I’ll do this work as a user with the cluster-admin ClusterRole.
kubectl create -f base-psp.yaml
kubectl create -f base-psp-cluster-role.yaml
kubectl create -f base-psp-cluster-role-binding.yaml
Validate the new Pod Security Policy configuration
Let’s try starting pods with different security specifications in the development namespace.
Start a pod with “safe” security specifications
I’ll first try starting a pod that runs as a non-root user with no permissive security requirements. The base-psp policy should allow this.
I’ll use the following non-root-pod.yaml manifest file for the first pod:
apiVersion: v1
kind: Pod
metadata:
  name: non-root-pod
spec:
  securityContext:
    runAsUser: 1000
  containers:
  - name: non-root
    image: busybox
    command: [ "sh", "-c", "sleep 1h" ]
Create the pod and then check its status:
kubectl -n development create -f non-root-pod.yaml
pod/non-root-pod created
kubectl -n development get pods
NAME           READY   STATUS    RESTARTS   AGE
non-root-pod   1/1     Running   0          12s
Thus, I can successfully start pods that don’t run as root and don’t require any special permissions.
Try to start a pod that runs as the root user
Next, let’s validate that my Pod Security Policies block starting a pod that runs as root in the development namespace. I’ll try to start a pod with a container that runs as root. For this example, I’ll use an NGINX image.
kubectl -n development run nginx-test --restart Never --image nginx
pod/nginx-test created
kubectl -n development get pods
NAME           READY   STATUS                       RESTARTS   AGE
nginx-test     0/1     CreateContainerConfigError   0          29s
non-root-pod   1/1     Running                      0          10m
kubectl -n development describe pod nginx-test
Name:         nginx-test
Namespace:    development
... <content omitted> ...
Events:
  Type     Reason     Age                From                Message
  ----     ------     ----               ----                -------
  Normal   Scheduled  70s                default-scheduler   Successfully assigned development/nginx-test to k3worker3
  Normal   Pulled     13s (x6 over 63s)  kubelet, k3worker3  Successfully pulled image "nginx"
  Warning  Failed     13s (x6 over 63s)  kubelet, k3worker3  Error: container has runAsNonRoot and image will run as root
  Normal   Pulling    2s (x7 over 70s)   kubelet, k3worker3  Pulling image "nginx"
The message Error: container has runAsNonRoot and image will run as root shows that the pod is being blocked from starting because it is trying to run a container as root. So, I can start a pod that runs as a non-root user, but I am blocked from starting a pod that runs as root. This shows that the base-psp PodSecurityPolicy resource works as expected!
Check if the less restrictive policies override the base policy
The default namespace has more than one policy authorized. One of those policies does not restrict starting pods that run as root. Let’s validate that the new base-psp PodSecurityPolicy resource does not block my ability to start a pod running as root in the default namespace. I’ll test this with user-1, since that user account has the edit ClusterRole (but not the cluster-admin ClusterRole) in the default namespace:
kubectl config use-context user-1@kubernetes
kubectl -n default get pods
No resources found in default namespace.
kubectl -n default run nginx-test --restart Never --image nginx
pod/nginx-test created
kubectl -n default get pods
NAME         READY   STATUS    RESTARTS   AGE
nginx-test   1/1     Running   0          25s
This shows that the approach of using a very restrictive “base” PodSecurityPolicy resource for the entire cluster and overriding it for specific users and namespaces works as expected.
The effective use of Pod Security Policies is a topic that requires careful planning. This blog post only scratches the surface. Even so, hopefully, I have shown that it is practical to use Pod Security Policies to help maintain a strong security posture while allowing justified exceptions.
If you have questions or feel like you need help with Kubernetes, Docker, or anything related to running your applications in containers, get in touch with us at Capstone IT.
Docker Accredited Consultant
Certified Kubernetes Administrator
Certified Kubernetes Application Developer