The security architecture we adopted at Aporeto worked: the Aporeto enforcers were blocking attacks that exploit CVE-2018-1002105 before the vulnerability was even announced. In this blog, we illustrate how a separation-of-concerns architecture that isolates security functions while placing them close to applications can maximize security defenses, even when the protected application follows the strictest security guidelines.
The Kubernetes control plane is one of the most secure instantiations of system software. Security has been at the core of its development and culture since day one, at a level of design excellence that few software distributions can claim: mTLS everywhere, RBAC and authorization, code reviews, and a code base written in Go, which removes many of the classic code vulnerabilities. Despite that, a major vulnerability that few people were expecting was discovered in the Kubernetes API server.
Significant lessons from this incident:
- Even the most secure software can have bugs that lead to vulnerabilities.
- Responding to the incident often requires fast patching. In some cases, where service providers manage Kubernetes clusters, this might be seamless. For private Kubernetes deployments though, who wants to touch the control plane in the middle of the night?
- Vulnerabilities such as CVE-2018-1002105 can go undetected for a very long time. No security tool can easily detect such an attack. Traffic logs, API audit logs, and system behavior would look entirely reasonable.
- While managing security for applications is important, securing the control plane of any automated system is where the process has to start. It does not matter how advanced the tools that manage the security lifecycle of applications are if the control plane of an orchestration system can bypass every security control.
So, what can we do about it, other than a quick patch?
Our enforcers blocked attacks against the disclosed vulnerability for all deployments that use Aporeto to protect the Kubernetes control plane. As we demonstrate in this blog, a vulnerable Kubernetes API server behind Aporeto enforcers remains protected even when API authorization is entirely open. This was achieved by embracing security principles that have been known for a long time.
- Our thesis is that software will always be vulnerable. Maximum protection is possible only by mandating a zero-trust approach based on end-to-end authentication and authorization everywhere.
- Separation of concerns between the “security components” of the software and the actual application code is at the core of proper security hygiene. This architecture, inspired by the Factotum concepts of Plan9, allows us to achieve several distinct goals:
– Manage the lifecycle of security functions independently of the lifecycle of applications or the underlying operating system kernel.
– Offload complex security and cryptographic functions from the application to a specialized core component of the system.
– Enable code from completely different development environments to participate in security protection, thus avoiding replication of libraries and code that share the same source of vulnerabilities. Based on our design, even if the Aporeto enforcers had such a vulnerability, the patching process would not require a complete redeployment of applications. Instead, we would merely redeploy the security sub-system of the application.
– Enable separation of concerns where security policy is managed independently, but close to and in relation to the application lifecycle.
To some extent, this philosophy has been adopted by service mesh architectures. In the next sections, we describe the vulnerability and how the Aporeto approach prevents it from being exploited in the first place.
In the video below, we demonstrate a vulnerable Kubernetes cluster. We show that the cluster can be attacked using the utility published on GitHub. We then enable the Aporeto enforcers, protect the Kubernetes API server with simple policies, and demonstrate that the vulnerability is no longer exploitable.
Several articles have described in detail the vulnerability that was discovered in Kubernetes and the origin of the discovery. The Kubernetes API server proxies API requests to downstream components like the Kubelet using mutual TLS authorization. In this specific case, the Kubernetes API uses a proxy technique to enable a websocket tunnel between a client and remote execution in a pod, which is often useful for troubleshooting. Unfortunately, this websocket upgrade is not fully validated. As a result, clients can bypass the API authorization after a failed request and access any pod or other API over the same connection. Websockets and upgrades are notorious for vulnerabilities since they enable a direct TCP connection between clients and servers that can be hijacked through several mechanisms. Indeed, many penetration tests will flag the use of websockets as a potential source of errors. More detailed explanations can be found on the Gravitational blog and Rancher blog.
The sequence of operations is:
- A client issues an invalid websocket upgrade request against the API server.
- The request fails to upgrade.
- The API server ignores this failure and hijacks and splices the TCP connections between the client and the target.
- The client can now issue subsequent requests over the same set of TCP connections, and these requests are no longer authenticated and authorized.
The code patch for this vulnerability can be found on GitHub.
It inserts a small check that prevents the stitching of the TCP sessions if the upgrade request fails.
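The patch itself is not reproduced here. As a rough illustration of the kind of check described, the following Go sketch parses a backend's reply to an upgrade request and reports the connection as safe to splice only when the backend answered `101 Switching Protocols`. The helper name `upgradeAccepted` and the sample responses are hypothetical and are not taken from the Kubernetes source.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// upgradeAccepted is a hypothetical helper (not Kubernetes code). It parses the
// backend's raw reply to an upgrade request and reports whether the backend
// actually agreed to switch protocols. A proxy should splice the client and
// backend TCP connections only when this returns true.
func upgradeAccepted(rawBackendResponse string) (bool, error) {
	reader := bufio.NewReader(strings.NewReader(rawBackendResponse))
	resp, err := http.ReadResponse(reader, nil)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusSwitchingProtocols, nil
}

func main() {
	accepted := "HTTP/1.1 101 Switching Protocols\r\nUpgrade: websocket\r\nConnection: Upgrade\r\n\r\n"
	rejected := "HTTP/1.1 400 Bad Request\r\nContent-Length: 0\r\n\r\n"

	for _, raw := range []string{accepted, rejected} {
		ok, err := upgradeAccepted(raw)
		if err != nil {
			fmt.Println("malformed backend response:", err)
			continue
		}
		if ok {
			fmt.Println("upgrade accepted: splice the connections")
		} else {
			fmt.Println("upgrade rejected: return the error and close the connection")
		}
	}
}
```

The design point is that the failure path ends the exchange. Without such a check, a rejected upgrade leaves the client holding a raw, already-authenticated tunnel to the backend.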
The Aporeto Solution
The Aporeto solution introduces transparent authentication, authorization, and encryption at Layers 3-7 for any protected application. The approach enables us to transparently insert this capability in front of any application, whether it is a Linux process or a container. Furthermore, we can customize the policy for every process in a container, even if the container is exposed using host mode as is the case of the Kubernetes API servers. The Aporeto enforcers are installed as independent software sub-systems in a server, completely decoupled from the kernel or the applications, and they may be upgraded or removed without restarting any of the applications. This fundamental decoupling allows the separation of the “bug-blast-radius” of the enforcers from the underlying kernel and the application itself. Similar to the Factotum architecture, security functions are delegated to the enforcers, whether they are related to access control, secrets management, or cryptographic operations.
The enforcer data path is based on the open source Trireme project, which implements end-to-end authorization at different layers of the stack:
– At L3, authorization is done by injecting identities as part of the connection negotiation process for both TCP and UDP. Specifically for TCP, the identity injection is achieved by leveraging the otherwise free payload of SYN/SYN-ACK packets. This transparently solves the network isolation problem without any dependencies on the underlying network infrastructure. For example, Kubernetes network policies are supported using Trireme over any CNI and any network technology.
– At L4, authorization is done by exchanging identities after establishing a connection, and thus transparently crossing L4 middleboxes. This authentication enables the same level of end-to-end security even when connections are crossing L4 load balancers or proxies.
– At L7, authorization is done at the API layer, by injecting identity exchange at HTTP headers, providing transparent end-to-end authorization over API load balancers and gateways.
In all cases, the Trireme approach is purist: granular, end-to-end authorization everywhere, no matter how many networks, middleboxes, or other elements a connection has to traverse, with no dependencies on IP addresses, port numbers, tunnels, or any other network construct.
Authorization is performed everywhere by distributing cryptographic identities to all workloads and managing policy through a distributed system that is based on attribute-based access control (ABAC). This allows tremendous flexibility in the type of information that can be used for an authorization decision. Identities are normalized to a standard format: PKI-signed, ephemeral JSON Web Tokens. No secrets ever leave the enforcers, and no private keys are ever disclosed.
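As a small illustration of what an attribute-based decision looks like, the Go sketch below allows a request only when the caller's identity carries every attribute that the policy requires. The `Policy` type, the attribute names, and the matching rule are hypothetical and simplified for this example. They are not the Aporeto policy language.

```go
package main

import "fmt"

// Policy is a hypothetical ABAC rule: a request is allowed when the caller's
// identity carries every required attribute with one of the accepted values.
// A policy with no required attributes allows every identity.
type Policy struct {
	Name     string
	Required map[string][]string // attribute key -> accepted values
}

// Allows reports whether the identity attributes satisfy the policy.
func (p Policy) Allows(identity map[string]string) bool {
	for key, accepted := range p.Required {
		value, present := identity[key]
		if !present || !contains(accepted, value) {
			return false
		}
	}
	return true
}

// contains reports whether v is one of the values in list.
func contains(list []string, v string) bool {
	for _, item := range list {
		if item == v {
			return true
		}
	}
	return false
}

func main() {
	policy := Policy{
		Name: "api-server-access",
		Required: map[string][]string{
			"role": {"operator", "admin"},
			"env":  {"prod"},
		},
	}

	operator := map[string]string{"role": "operator", "env": "prod", "team": "platform"}
	developer := map[string]string{"role": "developer", "env": "prod"}

	fmt.Println("operator allowed:", policy.Allows(operator))
	fmt.Println("developer allowed:", policy.Allows(developer))
}
```

Note that nothing in the decision refers to an IP address or a port. The inputs are attributes of the workload identity, which is what makes the policy independent of the network.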
Protection of the Kubernetes API Server
As we demonstrated in the video above, the enforcer can be deployed on any Linux machine. Since daemonsets do not apply to the master node in most deployments, we chose to deploy it as a systemd service. Alternative deployment models can be achieved using daemonsets for the worker nodes of Kubernetes. The enforcers were configured to protect the API service at Layers 3 and 7. At Layer 3, we initially blocked any access from any network and demonstrated that the API server was not accessible by the clients. After selectively enabling clients, we enabled API authorization at Layer 7. This authorization allows us to selectively let API calls pass through the enforcement functions if they present the proper identity. For the specific demonstration, we allowed all API calls to go through the enforcer, since Kubernetes itself is expected to perform API authorization.
When the script attempted to trigger the vulnerability, the enforcer handled the error response message of the downstream Kubernetes API server correctly and dropped the related TCP connection, thus preventing subsequent client attempts to access the server. The policy language allows operators to define additional authorization rules on any API paths at the Aporeto enforcer that are orthogonal to the Kubernetes policies. This approach provides a separation of concerns in which even a Kubernetes user with administrative access cannot enable RBAC rules that violate the security policies of an enterprise.
Any deployment must protect the most critical assets first, and in the case of Kubernetes this is the control plane. Any interaction between users or machines and the control plane must be at the top of the priority list, and mechanisms like the Aporeto enforcers provide a second level of protection for these critical assets. The control plane includes all the components of the Kubernetes master nodes, such as the scheduler, the API server, the etcd cluster, and even the UI server, which is known to have significant security issues.
The Benefits of Independent Security Actors
We do not claim that we predicted or knew about all the possible scenarios that would lead to such a vulnerability and intentionally put mechanisms in place to prevent it. Our first choice was to follow our design philosophy, which says that all security code has to be decoupled from application code as a second level of protection. This decoupling goes back to an old reliability technique for critical systems: running different versions of software that implement the same specification.
Our second choice was to use the correct libraries in the design process. We started down the path of implementing web sockets ourselves, and we considered the native Go web socket library. However, reading the code and the comments on GitHub drove us to a different decision.
So, we ended up using the gorilla WebSocket library, which provides the right protections when properly configured. So, *kudos to the gorilla web socket developers* for correctly implementing web sockets (see the project on GitHub). Security is a joint responsibility, and the community proved to be reliable again.
Interested in learning more about how Aporeto works with Kubernetes? Visit our blog here.
If you are at KubeCon this week, please stop by our booth (S/E29) where we will be demonstrating this solution.