Docker and most other container platforms have several networking options. In most cases, deployments will be based on bridge networks or some overlay technology. However, in some instances, a container needs direct access to the host network. For example, Consul has a use case of exposing a DNS server using host networks. The Red Hat OpenShift router and several ingress implementations in Kubernetes instantiate the ingress proxy on the host network so that they can use a fixed IP address.
In general, exposing a container directly on the host network poses several security challenges. The container is attached to the host namespace and can freely interact with any endpoint in the host network. This situation can be especially dangerous for an ingress proxy that is exposed to the Internet. Even a non-privileged container with host network access can become a security vulnerability.
Trireme decouples security from networking, and because it treats containers and Linux processes as equals, it can provide additional protection when an implementation requires access to the host network. Essentially, with Trireme, containers can still be isolated on the network even though they use the same IP address as the host.
To illustrate how to use Trireme with Docker host networks, we will use the Trireme example with default settings.
% sudo trireme-example daemon --hybrid
% docker run -l app=nginx --net=host -d nginx
If you want to verify that your container is running in host mode, issue the command:
% docker inspect <container id> | grep NetworkMode
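The grep above prints the raw JSON field. As a minimal sketch, the same check can be scripted with the --format option of docker inspect, which prints only the HostConfig.NetworkMode value. The helper names network_mode and is_host_mode are illustrative and are not part of Docker or Trireme.

```shell
# Print the network mode of a container (requires the docker CLI).
# $1 is a container ID or name.
network_mode() {
    docker inspect --format '{{.HostConfig.NetworkMode}}' "$1"
}

# Succeed only when the given mode string denotes the host network namespace.
# $1 is a mode string such as "host" or "bridge".
is_host_mode() {
    [ "$1" = "host" ]
}
```

For the nginx container started above, `is_host_mode "$(network_mode <container id>)" && echo host` should print host.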
% curl http://127.0.0.1
The command will fail and time out, because the nginx container is protected by Trireme by default.
% docker run -l app=nginx -it nhoag/curl
Assuming that your local Docker bridge is at 172.17.0.1 (the default in Docker), the nginx container should be accessible through the bridge IP. Initiate a curl command to the bridge IP:
# curl http://172.17.0.1
You will see that curl succeeds in this case. You can now exit from the curl container.
% trireme-example run --label=app=nginx curl -- http://127.0.0.1
This command should succeed, and you should see the default nginx welcome message. Note that we started the curl process with the same labels as the nginx container.
We started a Docker container with the --net=host parameter. The effect of this parameter is that the container uses the host network namespace and has direct access to the interfaces and the network of the host. Doing that without extra controls poses security risks. Trireme allows you to protect even containers started in the host network namespace. Because Trireme protected the container by default, the plain curl from the host failed. We then instantiated another container and a Linux process, both carrying the app=nginx label, and demonstrated how Trireme policies control which container or Linux process can interact with the host network container.
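The outcome of the three curl attempts can be modeled with a toy label check. This is only a sketch under an assumed rule: a flow is admitted when the client shares at least one label with the server. The actual policy used by the Trireme example may be more elaborate, and the function names has_label and may_connect are hypothetical.

```shell
# has_label LABELS WANTED
# Succeed if the comma-separated LABELS contain exactly the label WANTED.
has_label() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# may_connect CLIENT_LABELS SERVER_LABELS
# Toy policy (an assumption, not Trireme's implementation): admit the flow
# only if at least one client label also appears on the server.
# The body runs in a subshell so the IFS and noglob changes stay local.
may_connect() (
    IFS=,
    set -f
    for label in $1; do
        [ -n "$label" ] && has_label "$2" "$label" && return 0
    done
    return 1
)
```

Under this model, `may_connect "app=nginx" "app=nginx"` succeeds, matching the labeled curl container and the labeled curl process, while `may_connect "" "app=nginx"` fails, matching the unlabeled curl from the host.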
As we explained in a previous blog, Trireme treats containers and Linux processes equally from a network security perspective, and it can apply granular policy equally well to either. When a container is activated in the host network namespace, Trireme detects this activation. Instead of giving the container full access, it treats it as a Linux process. It takes the first process of the container (PID 1 inside the container) and places it in a dedicated net_cls cgroup, as it would do with any Linux process. All subsequent processes instantiated or forked inside the container inherit the same policy by default. Trireme can then apply granular policy to that particular container, even though the container is in the same network namespace as the host. This policy does not affect any other processes or containers running on the same host.
This capability makes Trireme very useful in environments where you need to run some containers in host network mode.
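To observe this cgroup placement on a running host, one can read /proc/<pid>/cgroup for the container's first process. The following is a minimal sketch that assumes a cgroup v1 host, where each line has the form hierarchy-ID:controller-list:path. The function name net_cls_path is illustrative, and the name of the cgroup that Trireme creates is not specified here.

```shell
# Read cgroup v1 membership data (the format of /proc/<pid>/cgroup) on stdin
# and print the cgroup path of the hierarchy that includes the net_cls
# controller. Prints nothing if no such hierarchy is listed.
net_cls_path() {
    awk -F: '
        {
            n = split($2, ctrl, ",")
            for (i = 1; i <= n; i++) {
                if (ctrl[i] == "net_cls") {
                    # The path may itself contain ":", so cut it from the
                    # full line instead of relying on field 3.
                    print substr($0, length($1) + length($2) + 3)
                    exit
                }
            }
        }'
}
```

A possible invocation is `net_cls_path < /proc/$(docker inspect --format '{{.State.Pid}}' <container id>)/cgroup`, which uses the host PID that Docker reports for the container's first process. On a cgroup v1 host with the usual mount point, the class ID attached to that cgroup can then be read from /sys/fs/cgroup/net_cls/<path>/net_cls.classid.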
The Trireme isolation for host networks is of course not as strong as full network namespace isolation. Containers will still share the same ports and network namespace capabilities. However, in some cases, users are looking for less granular isolation, since networking with containers can become operationally very complex. See this blog for some discussion on the topic.
Even the original Google Borg architecture used this approach to minimize network complexities:
All containers running on a Borg machine share the host’s IP address, so Borg assigns the containers unique port numbers as part of the scheduling process.
Although this is not optimal, several production Docker deployments have decided to take the risk and instantiate all containers in host network mode, because they do not want to adapt their applications to either random ports (the Docker bridge approach) or an IP per container (the Kubernetes default). One cannot discount the pragmatic operational reasons that lead teams down this path.
Obviously, this decision carries an isolation risk, and Trireme can bridge this gap. By using Trireme as the network policy mechanism, we have decoupled security from the network. Security is delegated to an end-to-end authorization function, and the fact that containers live in the same network namespace does not affect the ability of Trireme to provide strong isolation. Therefore, one can use Trireme to implement a container-based system without the need for complex networking.