Harden Docker with CIS – (P6) Container Runtime Configuration – Part 1

In this post, we’ll cover the last section of the CIS Benchmark for Docker that I’ll go through in this series. The benchmark has a few more sections; however, I’ll stop here, as the others deal with Docker Swarm, Docker EE, etc., which I am not familiar with. Since this section is huge, I have divided it into two parts. Part 1 covers the first 16 CIS recommendations for container runtime configuration, and the rest will be covered in Part 2.

CIS Control  Description
5.1          Ensure that, if applicable, an AppArmor Profile is enabled
5.2          Ensure that, if applicable, SELinux security options are set
5.3          Ensure that Linux kernel capabilities are restricted within containers
5.4          Ensure that privileged containers are not used
5.5          Ensure sensitive host system directories are not mounted on containers
5.6          Ensure sshd is not run within containers
5.7          Ensure privileged ports are not mapped within containers
5.8          Ensure that only needed ports are open on the container
5.9          Ensure that the host’s network namespace is not shared
5.10         Ensure that the memory usage for containers is limited
5.11         Ensure that CPU priority is set appropriately on containers
5.12         Ensure that the container’s root filesystem is mounted as read only
5.13         Ensure that incoming container traffic is bound to a specific host interface
5.14         Ensure that the ‘on-failure’ container restart policy is set to ‘5’
5.15         Ensure that the host’s process namespace is not shared
5.16         Ensure that the host’s IPC namespace is not shared
5.17         Ensure that host devices are not directly exposed to containers
5.18         Ensure that the default ulimit is overwritten at runtime if needed
5.19         Ensure mount propagation mode is not set to shared
5.20         Ensure that the host’s UTS namespace is not shared
5.21         Ensure the default seccomp profile is not Disabled
5.22         Ensure that docker exec commands are not used with the privileged option
5.23         Ensure that docker exec commands are not used with the user=root option
5.24         Ensure that cgroup usage is confirmed
5.25         Ensure that the container is restricted from acquiring additional privileges
5.26         Ensure that container health is checked at runtime
5.27         Ensure that Docker commands always make use of the latest version of their image
5.28         Ensure that the PIDs cgroup limit is used
5.29         Ensure that Docker’s default bridge “docker0” is not used
5.30         Ensure that the host’s user namespaces are not shared
5.31         Ensure that the Docker socket is not mounted inside any containers

5.1-5.2 Ensure that, if applicable, an AppArmor/SELinux Profile is enabled

AppArmor is to Debian-based systems what SELinux is to Red Hat/CentOS machines. What should go into an AppArmor/SELinux profile is outside this post’s scope; however, a container should always run with one attached.

Verify whether the running containers have an AppArmor/SELinux profile attached

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: SecurityOpt={{ .HostConfig.SecurityOpt }}'

To enable an AppArmor profile

docker run --interactive --tty --security-opt apparmor=PROFILENAME ubuntu /bin/bash

To enable an SELinux profile

# Run the Docker daemon with SELinux enabled (older releases used "docker daemon" instead of "dockerd")
dockerd --selinux-enabled
# Set the appropriate labels, levels and profile
docker run --interactive --tty --security-opt label=level:TopSecret centos /bin/bash
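To make the audit output easier to act on, the inspect lines can be piped through a small filter that prints only the containers with no security options set. This is a minimal sketch: the function name flag_missing_security_opt is made up for this post, and it assumes an empty field is rendered as [] or <no value>. An empty list only means no option was passed explicitly, so treat the result as a starting point for a closer look rather than a verdict.

```shell
# Print the IDs of containers whose SecurityOpt field is empty.
# Input (stdin): lines shaped like "<id>: SecurityOpt=[...]".
# Assumption: an unset field is rendered as "[]" or "<no value>".
flag_missing_security_opt() {
  awk -F': SecurityOpt=' '
    NF == 2 && ($2 == "[]" || $2 == "<no value>" || $2 == "") { print $1 }
  '
}

# Usage (requires a running Docker daemon):
#   docker ps --quiet --all \
#     | xargs docker inspect --format '{{ .Id }}: SecurityOpt={{ .HostConfig.SecurityOpt }}' \
#     | flag_missing_security_opt
```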

5.3 Ensure that Linux kernel capabilities are restricted within containers

Capabilities split the privileges traditionally held by root into distinct units that can be granted to or withheld from a process, and so define what it can and cannot do. How Linux kernel capabilities work in detail is out of scope here. However, if you know which capabilities your process needs, you should restrict the container to exactly those.

List the capabilities of the running containers

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: CapAdd={{ .HostConfig.CapAdd }} CapDrop={{ .HostConfig.CapDrop }}'

To add or drop capabilities

# Add capabilities
docker run --cap-add={"Capability 1","Capability 2"} <Run arguments> <Container Image Name or ID> <Command>
# Drop capabilities
docker run --cap-drop={"Capability 1","Capability 2"} <Run arguments> <Container Image Name or ID> <Command>
# Drop all capabilities and then add a select few
docker run --cap-drop=all --cap-add={"Capability 1","Capability 2"} <Run arguments> <Container Image Name or ID> <Command>
# The default list of capabilities applied to a container by Docker:
# AUDIT_WRITE, CHOWN, DAC_OVERRIDE, FOWNER, FSETID, KILL, MKNOD, NET_BIND_SERVICE, NET_RAW, SETFCAP, SETGID, SETPCAP, SETUID, and SYS_CHROOT
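The "drop everything, then add back a select few" pattern is easy to wrap in a helper that builds the flags from a list of capability names. The sketch below is only an illustration: cap_flags is a hypothetical name, and the capabilities passed to it are examples, not a recommendation for any particular image.

```shell
# Build "drop all, add back a few" flags for docker run.
# Arguments: the capability names to keep (the ones below are examples).
cap_flags() (
  flags="--cap-drop=ALL"
  for cap in "$@"; do
    flags="$flags --cap-add=$cap"
  done
  printf '%s\n' "$flags"
)

cap_flags NET_BIND_SERVICE CHOWN
# prints: --cap-drop=ALL --cap-add=NET_BIND_SERVICE --cap-add=CHOWN

# Usage sketch (word splitting of the output is intentional):
#   docker run $(cap_flags NET_BIND_SERVICE) <Run arguments> <Container Image Name or ID> <Command>
```

Starting from an empty set and adding back only what the process needs tends to be easier to review than dropping capabilities one by one from the default list.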

5.4 Ensure that privileged containers are not used

Running a container with the --privileged flag gives it all the capabilities a process can have on the host system. This is as bad an idea as it sounds and should be avoided unless you have a specific need for it, such as running Docker inside Docker.

Check for containers running with the privileged flag

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: Privileged={{ .HostConfig.Privileged }}'

For reference, this is how a privileged container is started, which is what you should avoid

docker run --interactive --tty --privileged centos /bin/bash

5.5 Ensure sensitive host system directories are not mounted on containers

Mounting sensitive host directories into a container is not a good idea, and doing so with read-write permissions is a serious mistake.

To check all the current mounts of the running containers

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: Volumes={{ .Mounts }}'

Docker mounts volumes read-write by default, but you can also mount a directory read-only. By default, no sensitive host directories are mounted within containers.
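Reviewing the mount list by eye gets tedious with many containers, so a small check against a list of sensitive paths can help. The sketch below is an assumption-laden illustration: is_sensitive_mount is a made-up helper, and the directory list (/, /boot, /dev, /etc, /lib, /proc, /sys, /usr) should be adjusted to whatever you consider sensitive on your hosts.

```shell
# Succeed if the given host path is, or lies under, a sensitive directory.
# Assumption: the directory list below is illustrative; adjust as needed.
is_sensitive_mount() {
  case "$1" in
    /|/boot|/boot/*|/dev|/dev/*|/etc|/etc/*|/lib|/lib/*|/proc|/proc/*|/sys|/sys/*|/usr|/usr/*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Usage (requires a running Docker daemon; $INSTANCE_ID is a container ID):
#   docker inspect --format '{{ range .Mounts }}{{ println .Source }}{{ end }}' "$INSTANCE_ID" \
#     | while read -r src; do
#         is_sensitive_mount "$src" && echo "sensitive mount: $src"
#       done
# A directory that must be mounted can at least be made read-only:
#   docker run --volume /host/path:/container/path:ro <Container Image Name or ID>
```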

5.6 Ensure sshd is not run within containers

First of all, you should not SSH into running containers; if you have to get into a container at all, use docker exec.

For every container, run the following to get the process list and verify that no sshd daemon is running

docker exec $INSTANCE_ID ps -el
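Running that check by hand for every container does not scale, so it can be wrapped in a loop. The following is a rough sketch under a few assumptions: has_sshd is a hypothetical helper that simply looks for a process named sshd in the listing, and the container image is assumed to ship a ps binary (where it does not, docker top may be an alternative source for the listing).

```shell
# Succeed if the process listing on stdin contains an sshd process.
has_sshd() {
  grep -Eq '(^|[[:space:]/])sshd([[:space:]:]|$)'
}

# Usage (requires a running Docker daemon):
#   for id in $(docker ps --quiet); do
#     docker exec "$id" ps -el | has_sshd && echo "sshd is running in $id"
#   done
```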

5.7 Ensure privileged ports are not mapped within containers

This recommendation is rather subjective; in my opinion, it depends on the use cases you are dealing with. Host ports below 1024 are privileged, so mapping one should be a deliberate choice. By default, if the user does not explicitly declare a container-port-to-host-port mapping, Docker automatically maps the container port to an available high port on the host (the benchmark cites the 49153-65535 range).

To list all the port mappings for all the running containers

docker ps --quiet | xargs docker inspect --format '{{ .Id }}: Ports={{ .NetworkSettings.Ports }}'
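The map printed by docker inspect is awkward to parse, so for scripting it may be simpler to work from docker port, which prints one mapping per line. The sketch below assumes lines shaped like "80/tcp -> 0.0.0.0:8080"; flag_privileged_ports is a made-up name for this post.

```shell
# Print mappings whose host port is below 1024.
# Input (stdin): lines shaped like "80/tcp -> 0.0.0.0:8080".
flag_privileged_ports() {
  awk 'NF >= 3 {
    n = split($NF, parts, ":")   # the host port follows the last colon
    if (parts[n] + 0 < 1024) print
  }'
}

# Usage (requires a running Docker daemon):
#   for id in $(docker ps --quiet); do
#     docker port "$id" | flag_privileged_ports
#   done
```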

5.8 Ensure that only needed ports are open on the container

List all the port mappings

docker ps --quiet | xargs docker inspect --format '{{ .Id }}: Ports={{ .NetworkSettings.Ports }}'

Review the list and ensure that every mapped port is genuinely required by its container. If some ports are not required, consider removing them from the mapping and restarting the containers. Also, do not start containers with the --publish-all flag, as it publishes every port exposed by the image.

5.9 Ensure that the host’s network namespace is not shared

By default, containers run in Docker’s bridge network mode. Containers should not be run with the host network mode (--net=host): it removes network isolation, letting the container process open low-numbered ports and reach network services on the host like any other root process. To verify the network mode, use the following command

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: NetworkMode={{ .HostConfig.NetworkMode }}'

5.10 Ensure that the memory usage for containers is limited

Availability is one component of the CIA triad, so it should be taken into consideration when deploying containers. By default, containers run without a memory limit and share the Docker host’s memory. Nothing stops one container from hogging all of it and, in turn, starving other processes. For this reason, it is advisable to run containers with memory limits.

Command to check memory limits

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: Memory={{ .HostConfig.Memory }}'

If the output is 0 for a container, no memory limit is in place for it, and you should consider setting one.

docker run --interactive --tty --memory 256m centos /bin/bash
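The inspect command reports the limit in bytes, which is not very readable when you scan a long list. A tiny helper can turn the value into a label; memory_label below is a hypothetical name, and the sketch only handles whole-MiB rounding.

```shell
# Turn a byte count from .HostConfig.Memory into a readable label.
memory_label() {
  if [ "$1" -eq 0 ]; then
    echo "unlimited"
  else
    echo "$(( $1 / 1024 / 1024 )) MiB"
  fi
}

memory_label 268435456   # value reported for --memory 256m; prints: 256 MiB
memory_label 0           # prints: unlimited
```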

5.11 Ensure that CPU priority is set appropriately on containers

Just like memory, CPU time is shared equally among containers by default, and you should consider prioritising it explicitly. To verify whether CPU shares are set

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: CpuShares={{ .HostConfig.CpuShares }}'

If the output is 0 or 1024, the container runs with the default CPU share and no prioritisation is in place.

To set CPU shares for a container

docker run --interactive --tty --cpu-shares 512 centos /bin/bash

CPU shares are relative weights, and every container gets 1024 by default. A container started with 512 shares therefore gets half as much CPU time as a default container when both compete for the CPU, and one with 256 gets a quarter as much. The weights are only enforced under contention; on an otherwise idle host, the container can still use all the available CPU.
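Since shares are relative weights, the slice a container gets under full contention is its own shares divided by the sum of the shares of all busy containers. The arithmetic can be sketched as follows; cpu_share_pct is a made-up helper, it uses integer division, and it assumes every listed container is competing for the CPU at the same time.

```shell
# Approximate CPU percentage for a container under full contention.
# Arguments: this container's shares first, then the shares of the others.
cpu_share_pct() (
  mine=$1
  total=0
  for s in "$@"; do
    total=$(( total + s ))
  done
  echo $(( mine * 100 / total ))
)

cpu_share_pct 512 1024 512   # 512 / 2048, prints: 25
cpu_share_pct 512 1024       # 512 / 1536, prints: 33
cpu_share_pct 512            # no competition, prints: 100
```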

5.12 Ensure that the container’s root filesystem is mounted as read only

The container’s root filesystem should be treated as a ‘golden image’ by using docker run’s --read-only option. This prevents any writes to the container’s root filesystem at container runtime and enforces the principle of immutable infrastructure.

To check if containers are running with read-only root file system

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: ReadonlyRootfs={{ .HostConfig.ReadonlyRootfs }}'

Enabling --read-only at container runtime may break some container OS packages if a data writing strategy is not defined.

--tmpfs, shared mounts, and volume mounts can be used to give the running program writable locations inside the container.

5.13 Ensure that incoming container traffic is bound to a specific host interface

By default, Docker exposes the container ports on 0.0.0.0, the wildcard IP address that will match any possible incoming network interface on the host machine. This behavior may not be desired if the machine has multiple network interfaces, e.g., LAN and WAN interfaces, and the container in question is spun up for testing purposes and should be exposed only to the LAN.

To verify network interface mapping

docker ps --quiet | xargs docker inspect --format '{{ .Id }}: Ports={{ .NetworkSettings.Ports }}'

To bind to a specific network interface, use the following form of the publish option

docker run --detach --publish 10.2.3.4:49153:80 nginx
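To spot containers that are still bound to the wildcard address, the per-line output of docker port can be filtered. This is a sketch that assumes lines shaped like "80/tcp -> 0.0.0.0:49153" (with [::] or :: as the IPv6 wildcard); flag_wildcard_bindings is a hypothetical name.

```shell
# Print mappings that are bound to a wildcard address.
# Input (stdin): lines shaped like "80/tcp -> 0.0.0.0:49153".
flag_wildcard_bindings() {
  grep -E -- '-> (0\.0\.0\.0|\[::\]|::):[0-9]+$'
}

# Usage (requires a running Docker daemon):
#   for id in $(docker ps --quiet); do
#     docker port "$id" | flag_wildcard_bindings
#   done
```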

5.14 Ensure that the ‘on-failure’ container restart policy is set to ‘5’

To verify the restart policy and its retry count, use the following command

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: RestartPolicyName={{ .HostConfig.RestartPolicy.Name }} MaximumRetryCount={{ .HostConfig.RestartPolicy.MaximumRetryCount }}'

Setting the restart policy to always makes a failing container restart indefinitely, which can hide the underlying problem from notice. So, ensure that the restart policy is set to on-failure with a definite retry count.

docker run --detach --restart=on-failure:5 nginx

Here the Nginx container is set to restart on failure, for a maximum of 5 attempts.
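The audit output can also be filtered so that only the containers deviating from the recommendation remain. The sketch below makes a few assumptions: flag_restart_policy is a made-up name, containers with no restart policy (reported as "no" or an empty name) are treated as acceptable, and an on-failure policy with a retry count of 0 is flagged because it places no bound on the retries.

```shell
# Print inspect lines whose restart policy is not on-failure with 1..5 retries.
# Input (stdin): "<id>: RestartPolicyName=<name> MaximumRetryCount=<n>".
flag_restart_policy() {
  awk '{
    name = ""; count = 0
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^RestartPolicyName=/) { name = $i; sub(/^RestartPolicyName=/, "", name) }
      if ($i ~ /^MaximumRetryCount=/) { count = $i; sub(/^MaximumRetryCount=/, "", count) }
    }
    if (name == "" || name == "no") next   # no restart policy configured
    if (name != "on-failure" || count + 0 < 1 || count + 0 > 5) print $0
  }'
}

# Usage: pipe the output of the audit command above into flag_restart_policy.
```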

5.15 Ensure that the host’s process namespace is not shared

The PID namespace provides separation between processes. It prevents system processes from being visible and allows process IDs to be reused, including PID 1. If the host’s PID namespace is shared with containers, they can see all of the host system’s processes.

Check the PID mode for all the running containers

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: PidMode={{ .HostConfig.PidMode }}'

Do not start any container with --pid=host, as the following command does

docker run --interactive --tty --pid=host centos /bin/bash

5.16 Ensure that the host’s IPC namespace is not shared

The IPC namespace provides separation of IPC between the host and containers. If the host’s IPC namespace is shared with the container, processes within the container can see all IPC on the host system.

To verify the IPC mode of the running containers

docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: IpcMode={{ .HostConfig.IpcMode }}'

Do not run containers with the --ipc=host option, as the following command does

docker run --interactive --tty --ipc=host centos /bin/bash

This completes Part 1 of the “Container Runtime Configuration” section of the CIS Docker Benchmarks. We’ll continue with others in Part 2.

If you have questions or need help setting things up, reach out to me @jtnydv