Runtime Container Image Security With Anchore and OpenShift

I have previously written posts on using Helm on OpenShift and installing Anchore Engine on OpenShift using Helm. Today I'm building on those concepts by talking about a relatively new Kubernetes feature called admission controllers. There are two types: mutating admission controllers, which let you modify resources as they are created, and validating admission controllers, which let you accept or reject resources based on your own checks. I want to talk about the second type, because it lets us hook into the process chain when a pod is created.

You can read up more on the OpenShift/Kubernetes admission controller resource here. It includes a link to an example on how to build your own validator service. Vic Iglesias, a solutions architect at Google, has created an example based on this which makes the call out to the Anchore Engine. We'll be using this example in this article.

In an earlier article, I talked about securing container images that you build as part of your CI/CD pipeline with Jenkins. The Anchore Engine can detect a wide variety of issues, such as libraries that have critical CVEs, Dockerfiles that use ports that are blacklisted, secrets and other sensitive content in images, and other user-defined checks. You can look at this document to see a list of the various policy checks that you can put in place.
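
If you want to see what one of these policy evaluations looks like outside of any pipeline, you can run one by hand with anchore-cli. This is only a sketch: the image reference is a placeholder, and the URL and credentials should match your own engine install (the values here mirror the ones I use later in this article); run it somewhere that can actually reach the engine service, such as a pod in the cluster or via a route or port-forward.


# Engine URL and credentials - match whatever your install uses
# (some anchore-cli versions expect a /v1 suffix on the URL)
export ANCHORE_CLI_URL=http://nomadic-jaguar-anchore-engine:8228
export ANCHORE_CLI_USER=admin
export ANCHORE_CLI_PASS=foobar

# Tell the engine to pull and analyze an image, wait for the analysis
# to finish, then evaluate it against the active policy
anchore-cli image add docker.io/library/nginx:latest
anchore-cli image wait docker.io/library/nginx:latest
anchore-cli evaluate check docker.io/library/nginx:latest --detail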

Using a validating admission controller, we can have OpenShift make a call out to the Anchore Engine to validate all images used in container deployments. This will add security on images that weren't scanned in your Jenkins pipeline, and also ensure that all images are still secure according to the policies of the Anchore Engine (new vulnerabilities could have been discovered, you may have added additional policy checks, etc.).

The Anchore Engine doesn't have an API to answer the admission request from Kubernetes directly, so we'll have to install a service that can do that. The workflow looks something like this: when a pod is created, OpenShift calls our validating webhook, the webhook is served by the validator service, and the validator service asks the Anchore Engine whether the pod's images pass policy.
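
Under the hood, the API server and the validator talk to each other with the AdmissionReview API: OpenShift POSTs an AdmissionReview object describing the pod to the webhook, and the webhook answers with allowed set to true or false. Here is a heavily trimmed sketch of that exchange; the field values are illustrative, not taken from a real cluster.


# Request from the API server to the validator (trimmed)
apiVersion: admission.k8s.io/v1beta1
kind: AdmissionReview
request:
  uid: 8c5d2b41-0000-0000-0000-000000000000   # illustrative uid
  operation: CREATE
  object:
    kind: Pod
    spec:
      containers:
      - name: app
        image: docker.io/library/nginx:latest
---
# Response from the validator after asking the Anchore Engine
apiVersion: admission.k8s.io/v1beta1
kind: AdmissionReview
response:
  uid: 8c5d2b41-0000-0000-0000-000000000000   # must match the request uid
  allowed: false
  status:
    message: "Image failed Anchore policy evaluation"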

One technical note that I'm still working out: the validating admission controllers that I'll be talking about in this article are a beta feature and are not enabled in OpenShift by default. I have filed an issue with the Origin team to understand how this can be easily done with Minishift or the oc cluster up method of running OpenShift. I'm going to get you most of the way there and then I'll update this with instructions on how to enable this feature when I get the answer. My understanding is that this feature should be enabled in upcoming releases of OpenShift.

Installing OpenShift, Helm and the Anchore Engine

Installing OpenShift

Just like in previous articles in the Container Security With Anchore series, I use oc cluster up to start my cluster. You can look at this article for more info. You can also use MiniShift, a full OKD install, or even the OpenShift Container Platform. The principles used in this article will apply in all cases.

Installing Helm

Similarly, you can read this article about installing Helm on OpenShift.

The helm chart we are going to be running needs to be able to create some resources in the default namespace, and this will require giving the tiller service account a new role: cluster-admin. To do this, we log in as the system:admin user and change to the default project:


$ oc login -u system:admin
Logged into "https://192.168.42.210:8443" as "system:admin" using existing credentials.
...
$ oc project default
Now using project "default" on server "https://192.168.42.210:8443".

Now we are going to edit a special resource called a clusterrolebinding, namely the cluster-admins binding. We edit it using the oc edit command:


$ oc edit clusterrolebinding cluster-admins

At the bottom is a subjects section, and to that we add the tiller service account:


...
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:cluster-admins
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:admin
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:serviceaccount:tiller:tiller
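
If you would rather not hand-edit the binding, you should be able to make the same grant with a single oc adm policy command. This assumes Tiller's service account is named tiller and lives in the tiller namespace, as in my earlier Helm article; the end result (cluster-admin rights for that service account) is the same either way.


$ oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:tiller:tiller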

We have to do one more permission change for the validator service. If you followed my instructions for installing the Anchore Engine earlier, you may have noticed we had to grant the anyuid permission (because the images we use currently expect to run things as root):


$ oc adm policy add-scc-to-user anyuid -z default

We need to do this again, but specifically for the analysis-anchore-policy-validator-init-ca service account, which we're going to create soon:


$ oc adm policy add-scc-to-user anyuid system:serviceaccount:anchore-engine:analysis-anchore-policy-validator-init-ca
scc "anyuid" added to: ["system:serviceaccount:anchore-engine:analysis-anchore-policy-validator-init-ca"]

Installing the Anchore Engine

I have installed the Anchore Engine using Helm. I've already written an article on doing that, which you can read here.

Installing the Example Validator Service

As mentioned above, Vic Iglesias from Google created a Helm chart for installing an example Validator service that works with the Anchore Engine, and you can get the chart on GitHub. I'll be working with that chart for the rest of this article.

Chart Changes

Like with the chart for the Anchore Engine, there are a couple small chart changes I need to make because the chart was developed for Kubernetes and not OpenShift. The files and directories I refer to below will be in the kubernetes-anchore-image-validator repo that I cloned from GitHub.

The first issue is that the chart includes installing the Anchore Engine, and we already did that (and had to because of a separate PostgreSQL install). Thus, we need to remove the Anchore Engine as a dependency. We do this by removing the requirements.yaml file in the anchore-policy-validator directory.
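
From the root of the cloned repo, that is just the following (if Helm has generated a requirements.lock alongside it, remove that as well):


$ rm anchore-policy-validator/requirements.yaml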

The second change is to use the password value that we set back when we set up the Anchore Engine, or foobar if you didn't change it. We will also set the value of ANCHORE_CLI_URL to point at the Anchore Engine service (run oc get svc and use the name Helm created - nomadic-jaguar-anchore-engine in my case). I need to make these changes in two files:

  • anchore-policy-validator/templates/default-policy/job.yaml
  • anchore-policy-validator/templates/deployment.yaml

In the job.yaml file, the changes happen around line 30, changing the values for the ANCHORE_CLI_PASS and ANCHORE_CLI_URL keys. It should look like this when you are done:


...
         env:
         - name: ANCHORE_CLI_USER
           value: admin
         - name: ANCHORE_CLI_PASS
           value: foobar
         - name: ANCHORE_CLI_URL
           value: "http://nomadic-jaguar-anchore-engine:8228"
...

The same changes should be made around line 37 of the deployment.yaml file.
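
A quick way to double-check both files is to grep for the keys you just edited (a sketch, run from the root of the cloned repo):


$ grep -nE -A 1 'ANCHORE_CLI_(PASS|URL)' \
    anchore-policy-validator/templates/default-policy/job.yaml \
    anchore-policy-validator/templates/deployment.yaml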

The next change is pretty minor. In the hack/install.sh file, a NAMESPACE variable points to where your Anchore Engine was installed (anchore-engine in my case). You can set the NAMESPACE value as an environmental variable or prefix the hack/install.sh command as well (NAMESPACE=anchore-engine hack/install.sh), but I prefer to just change it in the script.

You also need to comment out the helm dep build line, since we want to skip the requirements step. This goes hand in hand with removing the requirements file earlier: if we leave the dep build in but have no requirements.yaml, Helm errors out on the missing file, and if we comment out the line but leave the file, Helm detects the file and errors out because we didn't run the dependency build step. This makes my hack/install.sh file look like this:


#!/bin/bash
RELEASE_NAME=${RELEASE_NAME:-analysis}
NAMESPACE=${NAMESPACE:-anchore-engine}
pushd anchore-policy-validator/
  # helm dep build
  helm install --timeout 600 --namespace $NAMESPACE -n $RELEASE_NAME .
popd

Finally, there is a hack/delete.sh script to clean things up. If you run into trouble or want to back this out, you can just run this script. It uses kubectl instead of oc, so I needed to replace kubectl with oc on each line (or see the sed one-liner after the script). My hack/delete.sh looks like this:


#!/bin/bash
RELEASE_NAME=${RELEASE_NAME:-analysis}
helm delete --purge ${RELEASE_NAME}
oc delete ns test
oc delete role ${RELEASE_NAME}-anchore-policy-validator-init-ca
oc delete rolebinding extension-${RELEASE_NAME}-anchore-policy-validator-init-ca-admin
oc delete configmap ${RELEASE_NAME}-init-ca ${RELEASE_NAME}-default-policy
oc delete jobs ${RELEASE_NAME}-init-ca ${RELEASE_NAME}-default-policy
oc delete clusterrolebinding extension-${RELEASE_NAME}-anchore-policy-validator-init-ca-cluster
oc delete clusterroles ${RELEASE_NAME}-anchore-policy-validator-init-ca-cluster
oc delete validatingwebhookconfiguration ${RELEASE_NAME}-anchore-policy-validator.admission.anchore.io
oc delete serviceaccount ${RELEASE_NAME}-anchore-policy-validator-init-ca
oc delete apiservice v1beta1.admission.example.com
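
If you'd rather not edit the script by hand, a sed substitution does the same kubectl-to-oc swap (review the result before relying on it):


$ sed -i 's/kubectl/oc/g' hack/delete.sh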

Running the Chart

Now we can run the chart by simply running the hack/install.sh script (this should be run as the system:admin user):


$ oc login -u system:admin
...
$ hack/install.sh                                                   ~/git/kubernetes-anchore-image-validator/anchore-policy-validator ~/git/kubernetes-anchore-image-validator

It takes a few minutes, but eventually you will see a new pod spin up with the policy validator, running alongside our db, engine core, and engine worker pods:


$ oc get po
NAME                                                    READY     STATUS    RESTARTS   AGE
analysis-anchore-policy-validator-5fbd8b5659-smk5p      1/1       Running   0          1m
anchore-db-1-4dg9z                                      1/1       Running   0          10h
nomadic-jaguar-anchore-engine-core-666764cd44-s2s2f     1/1       Running   0          10h
nomadic-jaguar-anchore-engine-worker-6b46c4f885-shrps   1/1       Running   0          10h

It has also created a new service, deployment, and replica set for the validator.
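
You can see those with a quick filter on the validator's name (adjust the namespace if you installed into a different one):


$ oc get services,deployments,replicasets -n anchore-engine | grep policy-validator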

Creating the Validating Admission Controller

The last part of the output after we run the chart tells us that we need to run a couple commands to create the validating admission controller. This is the last piece of the puzzle - we installed the Anchore Engine and we installed the Validator service. Now we just need to install the webhook into the flow so that it calls the validator.

The first command grabs a certificate block from the OpenShift config called the certificate-authority-data. We can do this easily by dumping the config as JSON and parsing it with a tool called jq (if you don't have jq, you can install it from the jq homepage or from your Linux package manager - I ran sudo apt install jq). We store this certificate block in a variable called KUBE_CA. The command we see in the output uses kubectl, but we can just substitute oc:


KUBE_CA=$(oc config view --minify=true --flatten -o json \
    | jq '.clusters[0].cluster."certificate-authority-data"' -r)

The second command creates a file called validating-webook.yaml (I didn't misspell that - it is literally what came in the output and I'm leaving it as is), which is the definition of our webhook. I need to set the values for the clientConfig service properly - the namespace value should be where we put the Anchore Engine and the Validator service (anchore-engine for me) and the name value should point to the analysis-anchore-policy-validator service. That makes the command look like this:


cat > validating-webook.yaml <<EOF
 apiVersion: admissionregistration.k8s.io/v1beta1
 kind: ValidatingWebhookConfiguration
 metadata:
   name: analysis-anchore-policy-validator.admission.anchore.io
 webhooks:
 - name: analysis-anchore-policy-validator.admission.anchore.io
   clientConfig:
     service:
       namespace: anchore-engine
       name: analysis-anchore-policy-validator
       path: /apis/admission.anchore.io/v1beta1/imagechecks
     caBundle: $KUBE_CA
   rules:
   - operations:
     - CREATE
     apiGroups:
     - ""
     apiVersions:
     - "*"
     resources:
     - pods
   failurePolicy: Fail
EOF

The important parts of this resource definition are the clientConfig (call the anchore-engine/analysis-anchore-policy-validator service at path /apis/admission.anchore.io/v1beta1/imagechecks), the rules (which specify the hook should be called when pods are created), and the failurePolicy (which currently says to fail if the image doesn't meet the current security policy - we could also set this to Ignore).

The last step is to apply the yaml to create the webhook:


$ oc apply -f validating-webook.yaml
validatingwebhookconfiguration.admissionregistration.k8s.io "analysis-anchore-policy-validator.admission.anchore.io" created

That's it! The webhook is registered, and when it fires, it will call the validator service, which will in turn call the Anchore Engine.
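
To confirm the registration went through, you can list the validating webhook configurations and check that the analysis-anchore-policy-validator entry shows up:


$ oc get validatingwebhookconfigurations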

One Little Problem...

Unfortunately, this is where I have to stop, because as of right now, I can't actually demo this working on OpenShift when using oc cluster up. The admission controller feature is still a beta feature and is disabled by default, and I haven't figured out a way to turn it on in the oc cluster up version of OpenShift (everything else works fine - the only problem is the webhook never gets called). I have an issue filed to try to determine how this is done, and I'll update this article when I figure that out (or when a future release of Origin ships with it enabled by default).
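
For what it's worth, my current understanding (unverified - clearing this up is exactly what the filed issue is about) is that on a full OpenShift 3.x master you would enable the plugin in master-config.yaml with something roughly like the snippet below and then restart the master services; whether there's an equivalent knob for oc cluster up is the open question. Treat this as a sketch, not tested configuration.


admissionConfig:
  pluginConfig:
    ValidatingAdmissionWebhook:
      configuration:
        apiVersion: v1
        kind: DefaultAdmissionConfig
        disable: false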

Conclusion

In this article, I talked about how we can install a validating service to enforce image security on pod creation in Kubernetes and OpenShift. With this, we can apply the powerful security checks provided by the Anchore Engine both during a CI/CD pipeline image build and also at runtime before images are loaded into containers.

The example that I was working with had been developed for Kubernetes, but with a few changes to the Helm chart and some permissions commands, I was able to easily install the example Validator service. Then I just created the webhook to point to it.

Hopefully in the near future I get the answer on how to enable admission controllers in the OpenShift oc cluster up command so that I can demonstrate this working fully. I'll tweet and post updates when I get this resolved.
