Harden Docker with CIS — (P6) Container Runtime Configuration — Part 2

In this post, we’ll cover the last section of the CIS Benchmark for Docker that I’ll go through in this series. There are a few more sections in the CIS benchmark; however, I’ll stop here, as the others are about Docker Swarm, Docker EE, etc., which I am not familiar with. Since this section is huge, I have divided it into two parts. Part 1 covered the first 16 CIS recommendations for container runtime configuration; this post covers the remaining 15, from 5.17 to 5.31.

CIS Control   Description
5.1           Ensure that, if applicable, an AppArmor Profile is enabled
5.2           Ensure that, if applicable, SELinux security options are set
5.3           Ensure that Linux kernel capabilities are restricted within containers
5.4           Ensure that privileged containers are not used
5.5           Ensure sensitive host system directories are not mounted on containers
5.6           Ensure sshd is not run within containers
5.7           Ensure privileged ports are not mapped within containers
5.8           Ensure that only needed ports are open on the container
5.9           Ensure that the host’s network namespace is not shared
5.10          Ensure that the memory usage for containers is limited
5.11          Ensure that CPU priority is set appropriately on containers
5.12          Ensure that the container’s root filesystem is mounted as read only
5.13          Ensure that incoming container traffic is bound to a specific host interface
5.14          Ensure that the ‘on-failure’ container restart policy is set to ‘5’
5.15          Ensure that the host’s process namespace is not shared
5.16          Ensure that the host’s IPC namespace is not shared
5.17          Ensure that host devices are not directly exposed to containers
5.18          Ensure that the default ulimit is overwritten at runtime if needed
5.19          Ensure mount propagation mode is not set to shared
5.20          Ensure that the host’s UTS namespace is not shared
5.21          Ensure the default seccomp profile is not Disabled
5.22          Ensure that docker exec commands are not used with the privileged option
5.23          Ensure that docker exec commands are not used with the user=root option
5.24          Ensure that cgroup usage is confirmed
5.25          Ensure that the container is restricted from acquiring additional privileges
5.26          Ensure that container health is checked at runtime
5.27          Ensure that Docker commands always make use of the latest version of their image
5.28          Ensure that the PIDs cgroup limit is used
5.29          Ensure that Docker’s default bridge “docker0” is not used
5.30          Ensure that the host’s user namespaces are not shared
5.31          Ensure that the Docker socket is not mounted inside any containers

5.17 Ensure that host devices are not directly exposed to containers

This one is a no-brainer: don’t expose host devices directly to containers. A container does not need to run in privileged mode to access and manipulate a device passed in with --device, because by default it is granted read, write and mknod permissions on that device. If a device absolutely must be shared with a container, set appropriate permissions on it and take the associated risk into consideration.

To check for mapped devices

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: Devices={{ .HostConfig.Devices }}'

5.18 Ensure that the default ulimit is overwritten at runtime if needed

ulimit provides control over the resources available to the shell and to processes started by it. Setting system resource limits in a prudent fashion protects against denial of service conditions.

This covers the “Availability” part of the CIA triad and is an important piece: sensible limits stop a single container from exhausting host resources and causing a denial of service (DoS).
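
As a quick illustration of what such a limit does, independent of Docker, the sketch below lowers the soft open-files limit inside a subshell and reads it back. The function name show_nofile_limit is made up for this example, and it assumes the hard limit of your shell is at least 256.

```shell
# show_nofile_limit is a hypothetical helper, not part of Docker or the CIS benchmark.
# It lowers the soft "open files" limit inside a subshell and prints the value now
# in force, so the limit of the calling shell is left untouched.
show_nofile_limit() {
  ( ulimit -S -n "$1" && ulimit -S -n )
}

# Assumes the hard limit is at least 256; otherwise ulimit reports an error.
show_nofile_limit 256
```

Any process started from that subshell would inherit the lowered limit, which is the same effect --ulimit has on the processes inside a container.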

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: Ulimits={{ .HostConfig.Ulimits }}'

The command above should return Ulimits=<no value> unless there is a need to override the settings.

To run a container with custom Ulimits

docker run --ulimit nofile=1024:1024 --interactive --tty centos /bin/bash

5.19 Ensure mount propagation mode is not set to shared

Mount propagation mode allows mounting volumes in shared, slave or private mode on a container. A shared mount is replicated at all mounts and changes made at any mount point are propagated to all other mount points. Mounting a volume in shared mode does not restrict any other container from mounting and making changes to that volume.

To check the propagation mode of every mount

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: Propagation={{range $mnt := .Mounts}} {{json $mnt.Propagation}} {{end}}'

The command above should not return shared for any mount unless shared propagation is explicitly required.
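
To avoid reading that output by eye, a small filter can pick out the offending lines. This is only a sketch: flag_shared_mounts is a name invented here, and treating rshared (the recursive variant) the same as shared is my assumption rather than something the benchmark spells out.

```shell
# flag_shared_mounts (hypothetical helper) reads the output of the audit command
# above on stdin and prints only the containers that have a shared mount.
# Assumption: "rshared" is treated the same as "shared".
flag_shared_mounts() {
  grep -E '"r?shared"' || true   # "|| true" keeps the exit status 0 when nothing matches
}

# Usage, on a host where Docker is running:
#   docker ps --quiet --all | xargs docker inspect --format \
#     '{{ .Id }}: Propagation={{range $mnt := .Mounts}} {{json $mnt.Propagation}} {{end}}' \
#     | flag_shared_mounts
```

An empty result means no container was found with a shared mount.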

5.20 Ensure that the host’s UTS namespace is not shared

UTS namespaces isolate two system identifiers: the hostname and the NIS domain name. They are used to set the hostname and the domain that are visible to running processes in that namespace. Sharing the UTS namespace with the host gives each container full permission to change the hostname of the host.

To check whether any container is running with the UTS namespace set to host (UTSMode=host in the output means the namespace is shared)

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: UTSMode={{ .HostConfig.UTSMode }}'

5.21 Ensure the default seccomp profile is not Disabled

This is a key one, but it is very difficult to implement if the applications have not been audited properly: a seccomp profile that blocks a system call the application needs can crash it in production. So please use proper tools to work out the right seccomp profile for the applications you run as containers.

Seccomp filtering provides a means for a process to specify a filter for incoming system calls. The default Docker seccomp profile works on a whitelist basis and allows for a large number of common system calls, whilst blocking all others. This filtering should not be disabled unless it causes a problem with your container application usage.

To check the value of the seccomp profile

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: SecurityOpt={{ .HostConfig.SecurityOpt }}'

The command above should return either <no value> or YOUR_MODIFIED_SECCOMP_PROFILE. If it returns [seccomp:unconfined] (or [seccomp=unconfined], depending on how the option was passed), the container is running without any seccomp profile.

5.22 Ensure that docker exec commands are not used with the privileged option

This goes hand in hand with the controls on not running Docker containers with the privileged flag and/or as root. Chained together, these can present a serious problem for the Docker deployment. Using the --privileged option in docker exec commands gives extended Linux capabilities to the command. This could potentially be an insecure practice, particularly when you are running containers with reduced capabilities or with enhanced restrictions.

5.23 Ensure that docker exec commands are not used with the user=root option

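The implications are the same as for the control above, but the route is different: running docker exec with --user=root executes the command inside the container as root, even if the container itself was started as a non-root user.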

5.24 Ensure that cgroup usage is confirmed

At run time, it is possible to attach a container to a cgroup other than the one originally defined. This usage should be monitored and confirmed, because attaching to a different cgroup might grant the container excess permissions and resources, which is a security risk.

The command below returns the cgroup parent of each container

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: CgroupParent={{ .HostConfig.CgroupParent }}'

If it is blank, it means that containers are running under the default docker cgroup.

5.25 Ensure that the container is restricted from acquiring additional privileges

A process can set the no_new_privs bit in the kernel, and it persists across fork, clone and execve. The bit ensures that the process and its child processes do not gain any additional privileges via suid or sgid bits.

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: SecurityOpt={{ .HostConfig.SecurityOpt }}'

Look out for no-new-privileges in the output of this command for every container. You can run a container that is restricted from acquiring additional privileges the following way

docker run --rm -it --security-opt=no-new-privileges ubuntu bash
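
Since 5.21 and 5.25 both read the same SecurityOpt field, the two checks can be folded into one pass over that output. A rough sketch follows; check_security_opt and the wording of its findings are invented here, and it assumes the options show up in the output the way they were passed on the command line (for example no-new-privileges or seccomp=unconfined).

```shell
# check_security_opt (hypothetical helper) reads "ID: SecurityOpt=[...]" lines on
# stdin and prints one finding per problem it sees.
check_security_opt() {
  while IFS= read -r line; do
    id=${line%%:*}   # text before the first colon is the container ID
    # 5.21: seccomp must not be switched off
    case "$line" in
      *seccomp:unconfined*|*seccomp=unconfined*) echo "$id: seccomp is disabled" ;;
    esac
    # 5.25: no-new-privileges should be present and not set to false
    case "$line" in
      *no-new-privileges:false*|*no-new-privileges=false*)
        echo "$id: no-new-privileges is not set" ;;
      *no-new-privileges*) ;;   # set, nothing to report
      *) echo "$id: no-new-privileges is not set" ;;
    esac
  done
}

# Usage, on a host where Docker is running:
#   docker ps --quiet --all | xargs docker inspect --format \
#     '{{ .Id }}: SecurityOpt={{ .HostConfig.SecurityOpt }}' | check_security_opt
```

A container that passes both checks produces no output at all, so anything printed is worth a look.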

5.26 Ensure that container health is checked at runtime

Going back to the HEALTHCHECK probe on containers: verify that the check is in fact running and that the containers report a healthy status.

docker ps --quiet | xargs docker inspect --format '{{ .Id }}: Health={{ .State.Health.Status }}'

5.27 Ensure that Docker commands always make use of the latest version of their image

Use proper version pinning (the “latest” tag, which is assigned by default, is still vulnerable to caching attacks) so that you do not end up running a cached older version. Pin base images, packages, and entire images.
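
One way to spot unpinned images is to look at the image reference each container was started from. The sketch below is my own illustration, not a CIS audit step: classify_image_ref is a hypothetical helper that labels a reference as pinned by digest, pinned by tag, or floating (no tag, or the latest tag).

```shell
# classify_image_ref (hypothetical helper) prints "digest", "tag" or "floating"
# for a single image reference passed as its first argument.
classify_image_ref() {
  ref=$1
  case "$ref" in
    *@sha256:*) echo digest; return ;;
  esac
  name=${ref##*/}   # last path component, so a registry port is not mistaken for a tag
  case "$name" in
    *:latest) echo floating ;;
    *:*)      echo tag ;;
    *)        echo floating ;;   # no tag means Docker falls back to "latest"
  esac
}

# Usage, on a host where Docker is running:
#   docker ps --all --format '{{ .Image }}' | while read -r img; do
#     echo "$img: $(classify_image_ref "$img")"
#   done
```

Keep in mind that a tag can still be moved by whoever publishes the image, so a digest is the strictest form of pinning.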

5.28 Ensure that the PIDs cgroup limit is used

Attackers could launch a fork bomb with a single command inside the container. This fork bomb could crash the entire system and would require a restart of the host to make the system functional again. Using the PIDs cgroup parameter --pids-limit would prevent this kind of attack by capping the number of processes that can exist inside a container at any given time.

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: PidsLimit={{ .HostConfig.PidsLimit }}'

The output of the command above should not be 0 or -1. A PidsLimit of 0 or -1 means that any number of processes can be forked concurrently inside the container.
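
The same check can be scripted so that only the unlimited containers are listed. This is a minimal sketch under a couple of assumptions: flag_unlimited_pids is a name I made up, and it only looks for the two values named above, so any other placeholder your Docker version may print for an unset limit is not handled.

```shell
# flag_unlimited_pids (hypothetical helper) reads "ID: PidsLimit=N" lines on stdin
# and prints the containers whose limit is 0 or -1, i.e. effectively unlimited.
flag_unlimited_pids() {
  awk -F'PidsLimit=' 'NF == 2 && ($2 == "0" || $2 == "-1") { print $1 "unlimited PIDs" }'
}

# Usage, on a host where Docker is running:
#   docker ps --quiet --all | xargs docker inspect --format \
#     '{{ .Id }}: PidsLimit={{ .HostConfig.PidsLimit }}' | flag_unlimited_pids
```

A limit is set when the container is started, for example with docker run --pids-limit 100 followed by the image name; the value 100 is only an illustration and should be sized to what the application actually needs.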

5.29 Ensure that Docker’s default bridge “docker0” is not used

Docker connects virtual interfaces created in bridge mode to a common bridge called docker0. This default networking model is vulnerable to ARP spoofing and MAC flooding attacks as there is no filtering applied to it. You should follow the Docker documentation and set up a user-defined network. All the containers should be run in this network.

docker network ls --quiet | xargs docker network inspect --format '{{ .Name }}: {{ .Options }}'

5.30 Ensure that the host’s user namespaces are not shared

User namespaces ensure that a root process inside the container will be mapped to a non-root process outside the container. Sharing the user namespaces of the host with the container therefore does not isolate users on the host from users in the containers.

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: UsernsMode={{ .HostConfig.UsernsMode }}'

The output of the command above should not be host.

5.31 Ensure that the Docker socket is not mounted inside any containers

If the Docker socket is mounted inside a container, processes running within the container could execute Docker commands, which would effectively give them full control of the host.

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: Volumes={{ .Mounts }}' | grep docker.sock

This returns any instances where docker.sock has been mapped to a container as a volume.

Well, this concludes my Harden Docker with CIS series. It was a very long journey indeed, but we are done with Docker. I’ll work on new things and keep posting what I learn. It will be a fun ride.

If you have questions or need help setting things up, reach out to me @jtnydv