Docker Host Networks
Docker and most other container platforms have several networking options. In most cases, deployments will be based on bridge networks or some overlay technology. However, in some instances, a container needs direct access to the host network. For example, Consul has a use case of exposing a DNS server using host networks. The Red Hat OpenShift router and several ingress implementations in Kubernetes instantiate the ingress proxy on a host network so that they can use a fixed IP address.
In general, exposing a container directly on the host network poses several security challenges. The container is attached to the host namespace and can freely interact with any endpoint in the host network. This situation can be especially dangerous for an ingress proxy that is exposed to the Internet. Even a non-privileged container with host network access can become a security vulnerability. So: how do we best apply security for Docker containers?
Trireme decouples security from networking, and since it treats containers and Linux processes as equals, it can provide additional protection when implementations require access to the host network. Essentially, with Trireme, containers can still be isolated in the network even though they use the same IP address as the host.
TL;DR Show me how
To illustrate how to use Trireme with Docker host networks, we will use the Trireme example with default settings.
- Download and build trireme-example from https://github.com/aporeto-inc/trireme-example. Follow the instructions in the README file and make sure all the dependencies are installed on your system.
- Start trireme-example in one window
% sudo trireme-example daemon --hybrid
- Start an nginx container with host network access. The nginx container will be accessible through the host interface without any network address translation or port mapping.
% docker run -l app=nginx --net=host -d nginx
If you want to verify that your container is running in host mode, issue the command
% docker inspect <container id> | grep NetworkMode
- Despite the fact that your container is running in host mode, it is still protected by Trireme and it cannot be accessed. The default policy in Trireme allows two containers or processes to interact only if they are both protected by Trireme and they have the same labels. Try:
% curl http://127.0.0.1
The command will fail and time out.
- Instantiate a curl container now with the same labels as the nginx container that we just started.
% docker run -l app=nginx -it nhoag/curl
Assuming that your local Docker bridge is at 172.17.0.1 (the default in Docker), the nginx container should be accessible through the bridge IP. Initiate a curl command to the bridge IP:
root@b84a73c6d5ba:# curl http://172.17.0.1
You will see that curl succeeded in this case. You can now exit from the curl container.
- Since Trireme also supports Linux processes, we can access the container from the host, provided that we use Trireme to control the network capabilities of the Linux process. From your host shell, issue the command:
% trireme-example run --label=app=nginx curl -- http://127.0.0.1
This command should succeed and you should see the default nginx welcome message. Note that we started the curl process with the same labels as the nginx container.
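The behavior seen in the steps above follows from the default policy: two endpoints may talk only if both are protected by Trireme and they carry the same labels. The sketch below is a simplified model of that rule in plain shell, not Trireme's actual implementation. The function names policy_allows and normalize_labels, the yes/no "protected" flag, and the comma-separated label format are all assumptions made for illustration.

```shell
# Hypothetical helper: turn a comma-separated label list into
# sorted, unique lines so that label order does not matter.
normalize_labels() {
  printf '%s\n' "$1" | tr ',' '\n' | sed '/^$/d' | sort -u
}

# Hypothetical model of the default rule.
#   $1 = source protected by Trireme? (yes/no)   $2 = source labels
#   $3 = destination protected? (yes/no)         $4 = destination labels
# Returns 0 (allow) only if both sides are protected and the label sets match.
policy_allows() {
  [ "$1" = yes ] || return 1
  [ "$3" = yes ] || return 1
  [ "$(normalize_labels "$2")" = "$(normalize_labels "$4")" ]
}

# Plain curl from the host: the client is not protected, so it is rejected.
policy_allows no "" yes "app=nginx" && echo accept || echo reject

# curl container started with -l app=nginx: labels match, so it is accepted.
policy_allows yes "app=nginx" yes "app=nginx" && echo accept || echo reject
```

The first call models the plain curl that timed out, and the second models the curl container and the Trireme-wrapped curl process that succeeded.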
Applying Security for Docker
We started a Docker container with the --net=host parameter. As a result, the container uses the host network namespace and has direct access to the host's interfaces and network. Doing that without extra controls poses security risks. Trireme lets you protect even containers started in the host network namespace. Because Trireme protected the container by default, we then instantiated another container and a Linux process to demonstrate how Trireme policies control which containers or Linux processes can interact with the host network container.
Trireme and Host Networking Architecture
As we explained in a previous blog, Trireme treats containers and Linux processes equally from a network security perspective, and it can apply granular policy equally well to either. When a container is activated in the host network namespace, Trireme detects this activation. Instead of giving the container full access, it treats it as a Linux process: it takes the first process of the container (PID 1 inside the container) and places it in a dedicated net_cls cgroup, as it would with any Linux process. All subsequent processes instantiated or forked inside the container inherit the same policy by default. Trireme can then apply granular policy to that particular container, even though the container shares the host's network namespace. This policy does not affect any other processes or containers running on the same host.
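One way to observe the net_cls placement described above is to read a process's cgroup membership from /proc/<pid>/cgroup. The sketch below assumes a cgroup v1 host, where each line has the form hierarchy-ID:controller-list:cgroup-path. The function name net_cls_cgroup and the sample path /trireme/1234 are made up for illustration, and the cgroup name Trireme really uses may differ.

```shell
# Print the net_cls cgroup path found in a /proc/<pid>/cgroup-style file.
# Matches net_cls whether it is listed alone or co-mounted (e.g. net_cls,net_prio).
net_cls_cgroup() {
  awk -F: '$2 ~ /(^|,)net_cls(,|$)/ { print $3 }' "$1"
}

# Sample file standing in for /proc/<pid>/cgroup; the paths are illustrative only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
11:memory:/docker/abc
7:net_cls,net_prio:/trireme/1234
1:name=systemd:/docker/abc
EOF

net_cls_cgroup "$sample"   # prints /trireme/1234
rm -f "$sample"
```

On a live host, the host-side PID of the container's first process can be obtained with docker inspect --format '{{.State.Pid}}' <container id>, and the function can then be pointed at /proc/<pid>/cgroup for that PID.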
This capability makes Trireme very useful in environments where you need to run some containers in host network mode.
Taking it to the extreme
The Trireme isolation for host networks is of course not as strong as a full network namespace isolation. Containers will still share the same ports and network namespace capabilities. However, in some cases, users are looking for a less granular isolation, since networking with containers can become operationally very complex. See this blog for some discussion on the topic.
Even the original Google Borg architecture used this approach to minimize network complexities:
"All containers running on a Borg machine share the host’s IP address, so Borg assigns the containers unique port numbers as part of the scheduling process."
Although this is not optimal security for Docker, several production deployments have decided to take the risk and instantiate all containers in host network mode, because they don’t want to adapt their applications to either random ports (the Docker bridge approach) or an IP per container (the Kubernetes default). One cannot discount the pragmatic operational reasons that lead teams down this path.
Obviously, this decision carries an isolation risk, and Trireme can bridge this gap. By using Trireme as the network policy mechanism, we have decoupled security from the network. Security is delegated to an end-to-end authorization function, and the fact that containers live in the same network namespace does not affect Trireme's ability to provide strong isolation. Therefore, one can use Trireme to implement a container-based system without the need for complex networking, and improve security for Docker.