In this post, we’ll cover the last section of the CIS Docker Benchmark that I plan to write about. The benchmark has a few more sections, but I’ll stop here, as they deal with Docker Swarm, Docker EE, etc., which I am not familiar with. Since this section is huge, I have divided it into two parts. In part 1, we’ll cover the first 16 CIS recommendations for container runtime configuration; the rest will be covered in part 2.
| CIS Control | Description |
|---|---|
| 5.1 | Ensure that, if applicable, an AppArmor Profile is enabled |
| 5.2 | Ensure that, if applicable, SELinux security options are set |
| 5.3 | Ensure that Linux kernel capabilities are restricted within containers |
| 5.4 | Ensure that privileged containers are not used |
| 5.5 | Ensure sensitive host system directories are not mounted on containers |
| 5.6 | Ensure sshd is not run within containers |
| 5.7 | Ensure privileged ports are not mapped within containers |
| 5.8 | Ensure that only needed ports are open on the container |
| 5.9 | Ensure that the host’s network namespace is not shared |
| 5.10 | Ensure that the memory usage for containers is limited |
| 5.11 | Ensure that CPU priority is set appropriately on containers |
| 5.12 | Ensure that the container’s root filesystem is mounted as read only |
| 5.13 | Ensure that incoming container traffic is bound to a specific host interface |
| 5.14 | Ensure that the ‘on-failure’ container restart policy is set to ‘5’ |
| 5.15 | Ensure that the host’s process namespace is not shared |
| 5.16 | Ensure that the host’s IPC namespace is not shared |
| 5.17 | Ensure that host devices are not directly exposed to containers |
| 5.18 | Ensure that the default ulimit is overwritten at runtime if needed |
| 5.19 | Ensure mount propagation mode is not set to shared |
| 5.20 | Ensure that the host’s UTS namespace is not shared |
| 5.21 | Ensure the default seccomp profile is not Disabled |
| 5.22 | Ensure that docker exec commands are not used with the privileged option |
| 5.23 | Ensure that docker exec commands are not used with the user=root option |
| 5.24 | Ensure that cgroup usage is confirmed |
| 5.25 | Ensure that the container is restricted from acquiring additional privileges |
| 5.26 | Ensure that container health is checked at runtime |
| 5.27 | Ensure that Docker commands always make use of the latest version of their image |
| 5.28 | Ensure that the PIDs cgroup limit is used |
| 5.29 | Ensure that Docker’s default bridge “docker0” is not used |
| 5.30 | Ensure that the host’s user namespaces are not shared |
| 5.31 | Ensure that the Docker socket is not mounted inside any containers |
5.1-5.2 Ensure that, if applicable, an AppArmor/SELinux Profile is enabled
AppArmor is to Debian-based systems what SELinux is to RedHat/CentOS machines. The details of what should go into an AppArmor/SELinux profile are out of this document’s scope; however, a container should always run with one attached to it.
To verify whether the running containers have an AppArmor/SELinux profile attached
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: SecurityOpt={{ .HostConfig.SecurityOpt }}'
To enable an AppArmor profile
docker run --interactive --tty --security-opt="apparmor:PROFILENAME" ubuntu /bin/bash
To enable an SELinux profile
# Run docker daemon with SELinux enabled
dockerd --selinux-enabled
# Set the appropriate labels, levels and profile
docker run --interactive --tty --security-opt label=level:TopSecret centos /bin/bash
5.3 Ensure that Linux kernel capabilities are restricted within containers
Capabilities are the units of privilege in the Linux kernel that are attached to a process and define what it can and cannot do. A detailed discussion of Linux kernel capabilities is out of scope here. However, if you know which capabilities your process needs, you should also know how to restrict them in a Docker environment.
List the capabilities of the running containers
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: CapAdd={{ .HostConfig.CapAdd }} CapDrop={{ .HostConfig.CapDrop }}'
# Add capabilities
docker run --cap-add={"Capability 1","Capability 2"} <Run arguments> <Container Image Name or ID> <Command>
# Drop capabilities
docker run --cap-drop={"Capability 1","Capability 2"} <Run arguments> <Container Image Name or ID> <Command>
# Drop all capabilities and then add select few
docker run --cap-drop=all --cap-add={"Capability 1","Capability 2"} <Run arguments> <Container Image Name or ID> <Command>
# The default list of capabilities applied to a container by Docker
AUDIT_WRITE, CHOWN, DAC_OVERRIDE, FOWNER, FSETID, KILL, MKNOD, NET_BIND_SERVICE, NET_RAW, SETFCAP, SETGID, SETPCAP, SETUID, and SYS_CHROOT
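As a concrete illustration (the capability and image chosen here are just an example, not part of the benchmark), a web server that only needs to bind to a low port could drop everything else:
# Hypothetical example: drop all capabilities, then add back only NET_BIND_SERVICE
docker run --detach --cap-drop=all --cap-add=NET_BIND_SERVICE nginx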
5.4 Ensure that privileged containers are not used
If you run a docker container with the --privileged flag, you have essentially given that container every capability a process can have on the host system. Even by the sound of it, this is a terrible idea and should not be done unless you are looking to run Docker inside Docker.
To check for all containers running with the privileged flag
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: Privileged={{ .HostConfig.Privileged }}'
Running a privileged container (for reference, this is what not to do)
docker run --interactive --tty --privileged centos /bin/bash
5.5 Ensure sensitive host system directories are not mounted on containers
Mounting sensitive host directories directly into a container is not a good idea, and doing so with read-write permissions is an even bigger mistake.
To check all the current mounts of the running containers
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: Volumes={{ .Mounts }}'
Docker defaults to using a read-write volume, but you can also mount a directory read-only. By default, no sensitive host directories are mounted within containers.
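As a minimal sketch (the host and container paths here are placeholders), a host directory can be mounted read-only by appending :ro to the volume specification:
# Example only: mount a host config directory into the container as read-only
docker run --interactive --tty --volume /opt/app/config:/config:ro centos /bin/bash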
5.6 Ensure sshd is not run within containers
First of all, you should not be SSHing into running containers; if you have to get into a container at all, use docker exec.
For every docker container, run the following to get the process list and verify that no sshd daemon is running
docker exec $INSTANCE_ID ps -el
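If you genuinely need an interactive shell inside a container, a typical alternative to SSH looks like this (reusing the $INSTANCE_ID placeholder from above):
# Open a shell in the running container instead of SSHing into it
docker exec --interactive --tty $INSTANCE_ID /bin/bash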
5.7 Ensure privileged ports are not mapped within containers
This is a very subjective recommendation; in my opinion, it depends on the use cases you are dealing with. Privileged ports are those below 1024, which normally require root privileges to bind. By default, if the user does not specifically declare a container-port-to-host-port mapping, Docker automatically and correctly maps the container port to an available port in the 49153-65535 range on the host.
To list all the port mappings for all the running containers
docker ps --quiet | xargs docker inspect --format '{{ .Id }}: Ports={{ .NetworkSettings.Ports }}'
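If a service listens on a privileged port inside the container, it can still be mapped to a non-privileged port on the host; for example (the port numbers and image are illustrative):
# Map container port 80 to non-privileged host port 8080
docker run --detach --publish 8080:80 nginx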
5.8 Ensure that only needed ports are open on the container
List all the port mappings
docker ps --quiet | xargs docker inspect --format '{{ .Id }}: Ports={{ .NetworkSettings.Ports }}'
Review the list and ensure that all the mapped ports are genuinely required by each container. If there are ports that are not required, consider removing them from the mapping and restarting the containers. Also, do not start containers with the --publish-all flag, as it maps every port exposed by the container to host ports.
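Instead, publish only the ports the service actually needs; for example (the port numbers and image are illustrative):
# Publish a single required port explicitly, on a non-privileged host port
docker run --detach --publish 8443:443 nginx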
5.9 Ensure that the host’s network namespace is not shared
By default, containers run using the bridge network mode in Docker. Containers should not be run using the host network mode, as it gives the container process the same capabilities over the network that root has on the host. To verify the network mode, use the following command
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: NetworkMode={{ .HostConfig.NetworkMode }}'
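If NetworkMode shows host for any container, it was started on the host network, i.e., with something like the following, which should be avoided (the image here is just an example):
# What not to do: share the host's network namespace
docker run --interactive --tty --net=host centos /bin/bash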
5.10 Ensure that the memory usage for containers is limited
Availability is one of the components of the CIA triad, so it should be taken into consideration when deploying containers. By default, containers on a Docker host have no memory limits, which means nothing restricts a single container from hogging all the memory and, in turn, making other processes non-functional. For this very reason, it is advised to run containers with memory limits.
Command to check memory limits
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: Memory={{ .HostConfig.Memory }}'
If the output is 0 for all the containers, there are no memory limits in place, and you should consider setting some. For example:
docker run --interactive --tty --memory 256m centos /bin/bash
5.11 Ensure that CPU priority is set appropriately on containers
Just like memory, CPU time is shared across containers, and limits should be considered for it as well. To verify whether CPU limits are in place
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: CpuShares={{ .HostConfig.CpuShares }}'
If the output is 0 or 1024, it means CPU limits are not in place.
To apply CPU limits to a container
docker run --interactive --tty --cpu-shares 512 centos /bin/bash
CPU shares are relative weights, and the default for a container is 1024. Thus, a value of 512 signifies that, under contention, this container gets roughly 50% of the CPU time relative to a default container; similarly, a value of 256 would give it roughly 25%.
5.12 Ensure that the container’s root filesystem is mounted as read only
The container’s root filesystem should be treated as a ‘golden image’ using Docker run’s --read-only option. This prevents any writes to the container’s root filesystem at container runtime and enforces the principle of immutable infrastructure.
To check if containers are running with a read-only root filesystem
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: ReadonlyRootfs={{ .HostConfig.ReadonlyRootfs }}'
Enabling --read-only at container runtime may break some container OS packages if a data-writing strategy is not defined. --tmpfs mounts, shared mounts, and volume mounts can be utilized to map writable locations into the container for the running program.
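A minimal sketch of combining a read-only root filesystem with a writable scratch space (the tmpfs path and image are illustrative):
# Read-only root filesystem with a writable tmpfs mounted at /tmp
docker run --interactive --tty --read-only --tmpfs /tmp centos /bin/bash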
5.13 Ensure that incoming container traffic is bound to a specific host interface
By default, Docker exposes container ports on 0.0.0.0, the wildcard IP address that matches any network interface on the host machine. This behavior may not be desired if the machine has multiple network interfaces, e.g., a LAN and a WAN interface, and the container in question is spun up for testing purposes and should only be exposed on the LAN.
To verify network interface mapping
docker ps --quiet | xargs docker inspect --format '{{ .Id }}: Ports={{ .NetworkSettings.Ports }}'
To bind to a specific host interface, use the following structure of the publish flag
docker run --detach --publish 10.2.3.4:49153:80 nginx
5.14 Ensure that the ‘on-failure’ container restart policy is set to ‘5’
To verify the restart policy and other configuration options, use the following command
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: RestartPolicyName={{ .HostConfig.RestartPolicy.Name }} MaximumRetryCount={{ .HostConfig.RestartPolicy.MaximumRetryCount }}'
Setting the restart policy to always will result in a failed container endlessly trying to recover by restarting. Other problems with the container may therefore go unnoticed. So, ensure that the restart policy is set to on-failure with a definite retry count.
docker run --detach --restart=on-failure:5 nginx
The Nginx container is set to restart on failure and to retry at most 5 times.
5.15 Ensure that the host’s process namespace is not shared
The PID namespace provides separation between processes. It prevents host system processes from being visible inside the container and allows process IDs to be reused, including PID 1. If the host’s PID namespace is shared with containers, it allows them to see all of the host system’s processes.
Check the PID mode for all the running containers
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: PidMode={{ .HostConfig.PidMode }}'
Do not start any container with the --pid=host option, i.e., avoid commands like
docker run --interactive --tty --pid=host centos /bin/bash
5.16 Ensure that the host’s IPC namespace is not shared
The IPC namespace provides separation of IPC between the host and containers. If the host’s IPC namespace is shared with the container, it will allow processes within the container to see all of IPC communications on the host system.
To verify the IPC mode of the running containers
docker ps --quiet --all | xargs docker inspect --format '{{ .Id }}: IpcMode={{ .HostConfig.IpcMode }}'
Do not run containers with the --ipc=host option, i.e., avoid commands like
docker run --interactive --tty --ipc=host centos /bin/bash
This completes Part 1 of the “Container Runtime Configuration” section of the CIS Docker Benchmarks. We’ll continue with others in Part 2.
If you have questions or need help setting things up, reach out to me @jtnydv