[Start of recorded material 00:00:00]
Amir Sharif: Good morning, everyone. This is Amir Sharif. I’m the co-founder of Aporeto, and today I’m here to talk to you about why cloud security requires strong application identity. So let’s focus on the title – Why Cloud Security Requires Strong Application Identity. And specifically let’s pick out two words from there – cloud and application – and dig into those constructs. Because they are changing. They are morphing. And precisely because those two big trends are happening that change the cloud and change the applications, we have security challenges that require a new response.
So let’s talk about the cloud. What’s happening in the cloud space is there’s a massive migration going on from the data center to the cloud space. So we basically have your infrastructure effervescing from the private space into a public space. And basically everybody is in a hybrid cloud environment today, whether they know it or not. I talk with a lot of customers, with a lot of large companies, and the security teams of those companies may be under the impression that they have tight control over their cloud assets. Or they may be under the impression that, in fact, they have no cloud assets. But when you dig deeper within the organization, and you don’t have to dig very deep, you realize that there is in fact a shadow IT department going on. People are using various cloud resources, and that migration is happening. And precisely because of that migration, there are new threat vectors and new attacks that are possible on the organization.
The second trend that we have is a migration from the neatly stacked n-tier application to microservices, where things are talking to each other as needed, and lots of API calls are going on. So basically the application that was a nice, neat stack and was well-defined, is also disaggregating, and we’re migrating into containers and microservices. And that is also creating new security challenges. So let’s go back to the title – Why Cloud Security Requires Strong Application Identity – and here I want to emphasize a word – security. And that’s security in the cloud era as we’re going to the hybrid cloud, and as we’re going to a disaggregated application based on containers and microservices, and why that security requires strong identity.
So let’s step back a little bit and look at what’s going on inside our fabric, inside our data centers. If you’re as old as I am you realize that there used to be a world where you had physical servers within a data center. And that’s what we have in the bottom left corner of the graph. And right now we’re seeing a migration towards VMs and containers and serverless. I would argue that we’re nearly complete with our migration to VMs. We aren’t entirely there. Not everything has been virtualized, but nearly everything that could be virtualized has been. And then we are now at a precipice of migrating towards containers, and eventually we will be in a serverless world. What’s important here is that when we go towards serverless it’s not that there are no servers.
That serverless world is likely to be a set of functions that get checked in into containers. Those containers are, in fact, running on VMs. And the VMs are running on physical servers. So that entire infrastructure is there, but we have a higher level abstraction for ease of operations – better development usage. So as we’re going from servers to VMs to containers and eventually to serverless, there are two key trends that are going on. One is that with every inflection point, the number of endpoints on the network fabric is increasing by an order of magnitude. So basically there are nearly 10 times as many VMs as there are servers. And as we go towards containers there will be 10 times as many containers as there were VMs and so forth.
So the number of endpoints on the fabric is increasing. And correspondingly, they’re becoming more ephemeral. So the change frequency is also increasing by an order of magnitude. These in themselves create operational complexities precisely because at the lower end of the spectrum we’ve dealt in a world that was fairly static and fairly small in scale. We depended on things like network segmentation with software-defined networking. We had software-defined firewalls, with things like east-west firewalls, NATing, ACLs and so forth. And all of that happened slowly enough that manual control was good enough.
When I was a developer I would require a new subnet. I would file a ticket. That ticket would go to an IT group somewhere. They would provision a subnet for me. Somebody would track it in a spreadsheet, and two weeks later I would have my nice, pristine environment where I could assume that everything would be trusted. Clearly, that type of speed and scale cannot work in the cloud age, where applications are coming and going and where containers are popping up and down very quickly. What we need at the upper scale is an automated framework that is independent of the infrastructure, that the perimeter is basically defined by the application, and it follows the application.
And ideally you want to have the solution as part of the CICD pipeline – meaning that you learn about the application, you learn what the application behavior is, you extract its identity, and you work with this identity in order to secure the application. This is where we want to go. Now the question that you might have is, “Why is the increasing number of endpoints and increasing frequency a challenge for us in the cloud era?” And here it’s worth it to visit some of the tenets that help us have this kind of internet economy today.
The root of the internet was ARPANET, and the ARPANET was a communication network that was designed to survive a thermonuclear attack – meaning that if you wipe out two thirds of it, the other one third would be discoverable. Every node had to be able to find every other node. As this chart here shows, we have a simple six-node network, and everything is finding everything else. As we started to overlay commerce on this network we started creating exceptions because all of a sudden privacy became important. So we created firewall rules. And here, basically, firewall rules are exceptions to an otherwise completely discoverable network. That’s the way security is, by and large, today.
The problem with that, in the face of an increasing number of nodes and a higher rate of change, is that we’re calculating an n² order operation. The graph that I have is an (n² – n) divided by 2 type calculation. So if you look at the number of calculations, with a fairly small n = 6, we have to compute 15 different connections. As we go to 60 and 600 and eventually 6,000 – and 6,000 is not that big – basically we have to calculate 18 million connections in order to resolve the network. Now take the high rate of change. As the change happens we have to kick off these 18 million calculations every time. And as that 6,000 figure gets larger, that n² order computing gets exceedingly big. That’s a problem that we have with existing security solutions.
What we really want is a whitelist security model. We already know what the application is going to do. In a CICD pipeline we’ll learn about the application. We know what the connectivity pattern is. We know how it ought to behave in operations. And as such, what we really want is a much simpler approach to security. Take the whitelist approach. It’s a default deny. We only allow what is connected, and as a result we go into linear space: we grow only at an order of n. So if I have six nodes, basically I need six rules. If I have 6,000 nodes I only need 6,000 rules instead of the 18 million rules on the other side. So this is how cloud security ought to be resolved.
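The arithmetic behind the two models can be checked with a few lines of Python. This is only an illustrative sketch: the function name `pairwise_connections` is made up for the example, and the "one whitelist rule per node" figure simply follows the speaker's simplification. For six nodes the formula (n² – n) / 2 gives 15 pairs, and for 6,000 nodes it gives 17,997,000, which is the "18 million" quoted in the talk.

```python
def pairwise_connections(n: int) -> int:
    """Distinct node pairs in a fully discoverable n-node network: (n^2 - n) / 2."""
    return (n * n - n) // 2


def whitelist_rules(n: int) -> int:
    """Rule count under the talk's simplification of one whitelist rule per node."""
    return n


if __name__ == "__main__":
    for n in (6, 60, 600, 6000):
        print(f"{n:>5} nodes: {pairwise_connections(n):>12,} pairs "
              f"vs {whitelist_rules(n):>6,} whitelist rules")
```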
Now what does that mean? As you’re migrating to the cloud your application is touching an exceedingly large number of services. We are now in an API economy, where we tap into various resources, whether they were developed by us or by third parties and whether it’s our S3 bucket or TensorFlow or cheese or bananas or whatever resource that we’re going to access on the network – we have to be able to manage all of these services. So you have a third order of complexity that’s overlaid, which is the variety of services that we’re going to access everywhere. And to put the icing on the cake, this nice n-tier application is no longer well-defined. It’s actually a network of stuff that is communicating with a network of stuff at worst. Now you get the picture of how complex this is, and hopefully you understand why existing approaches, which are based on network controls, do not work really well.
So let’s talk about strong identity and what that means. Here we have a well-known character. We have his mugshot – Charlie Brown. I don’t know what he did to get in trouble, but I got this photo from the local county sheriff’s office. And what we know about this guy is his name is Charlie Brown. That’s part of his identity. So I know this picture. I know his name. But if I dig a little deeper I can learn about his Social Security Number. His Social Security Number is 123-45 – whatever – and now I have more information about him. I can better identify Charlie Brown when I see him. But I also know that Charlie Brown made his debut on May 30, 1948. He was born a long time ago, but he’s still a nice kid. Now I have an even stronger identity for who he is. But the more I learn about Charlie Brown, the stronger that identity is.
So I can learn about his address. I know that he lives on James Street. He’s a boy. He’s a student. He’s got a sister named Sally. Lucy likes him, and he’s got his pet. That’s Snoopy. That’s a dog. It’s a beagle. So the point is here the more I learn about Charlie Brown, the stronger the concept of identity is, and the harder it is for somebody to masquerade as him. That’s the value of strong identity. The whole point is the more details you get the better you know that person. And think about your interpersonal relationships. The more you know a person the harder it is for somebody else to masquerade as that person. When that person is ill, or there’s something off – they may be upset or happy – the easier it is for you to detect those behavior patterns.
Now when it comes to people, we’ve also added technology to that strong identity. We have biometrics: my phone does facial recognition. Many phones have thumb readers. And we do two-factor authentication for people when they want to interact with computer systems. This is what we want to do for applications – getting this rich set of data, doing effectively a biometric reading of the data and doing a multi-factor authentication on the network for the workloads in an era where network control becomes exceedingly difficult because of the sheer scale and sheer variety of stuff that we’re doing.
So what does this mean for applications? And here I’m going to go from a kind of theoretical approach and talk specifically about what Aporeto is doing to solve this problem. Operating Aporeto is quite simple and easy. Basically we start with the enforcer. The enforcer is part of our distributed security approach. It’s either a process or a container that co-resides with the application, wherever that application may be running – on your premise or in a public cloud. And what the enforcer does is it listens for events. When the new workload comes up it detects it and says, “Now there’s a workload.”
The next thing that the enforcer does automatically – it then goes and it interrogates known resources to get as much information as it can to create a rich identity for that workload. So it goes to the orchestration engine and extracts various metadata. It goes to the operating system where the application is running and extracts more metadata. And if you’re running on the cloud it goes to the cloud identity document. And in case you have third party feeds from image scanners or any other ancillary system, they can provide even more metadata. The enforcer goes and extracts all this and creates a set of key-value pairs that create the rich identity.
So we have a set of key-value pairs here. And the more we have – just like the Charlie Brown case – the better we know the application. And on top of it we inject a cryptographic signature – a nonce if you would – and sign the certificate so nobody else can masquerade as the workload that we’re trying to protect and trying to identify. This is how we generate the rich identity for the application. Quite simple. It’s done automatically. It’s done at run time based on the known resources that we have. And the more we have the richer that the identity can be.
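A minimal sketch of the idea described above: collect metadata from several sources into key-value pairs, add a nonce, and sign the result so it cannot be forged or altered. Everything here is an assumption for illustration. The function names, the tag names and the use of an HMAC with a shared key are not from the talk; the talk mentions signing a certificate, so the real product presumably uses certificate-based signatures.

```python
import hashlib
import hmac
import json
import secrets


def build_identity(sources: dict) -> dict:
    """Flatten metadata from each source into one set of key-value pairs."""
    identity = {}
    for source, tags in sources.items():
        for key, value in tags.items():
            identity[f"{source}:{key}"] = value
    return identity


def _payload(tags: dict, nonce: str) -> bytes:
    # Canonical serialization so signer and verifier hash identical bytes.
    return json.dumps({"tags": tags, "nonce": nonce}, sort_keys=True).encode()


def sign_identity(tags: dict, key: bytes) -> dict:
    """Attach a random nonce and a signature (HMAC-SHA256 as a stand-in)."""
    nonce = secrets.token_hex(16)
    signature = hmac.new(key, _payload(tags, nonce), hashlib.sha256).hexdigest()
    return {"tags": tags, "nonce": nonce, "signature": signature}


def verify_identity(token: dict, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(key, _payload(token["tags"], token["nonce"]),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["signature"])


# Hypothetical metadata, loosely modeled on the sources named in the talk.
EXAMPLE_SOURCES = {
    "orchestrator": {"app": "payroll", "env": "prod"},
    "os": {"user": "svc-payroll"},
    "cloud": {"provider": "gcp"},
    "scanner": {"critical-vulns": "0"},
}
```

Calling `sign_identity(build_identity(EXAMPLE_SOURCES), key)` yields a token that `verify_identity` accepts only if the tags, nonce and signature are all unchanged.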
Then the next natural question that you would ask is, “Okay, now we have this identity. What does it mean for the protection of the application?” Let’s go to the next level. And we take a simple English sentence, “Good companies like Aporeto create customer value.” So the subject here is “good companies.” The object is “customer value,” and the verb is “create.” Subject is that which does something, object is that which is acted on, and verb is what we do. Here our verbs are quite simple – allow, read, encrypt, and so forth. And when we create policies we’re basically creating scopes. In this case we would do set operations based on key-value pairs, based on the information that we know, to determine which subjects can do what to which objects.
So more specifically, if I want to have a very permissive policy that says, “Any application in my production environment can talk to itself,” this is the sentence I would write, “Env=prod connect to env=prod.” Anything within production can now connect with itself. If I want to have a more restrictive policy, I would write a sentence that says, “My payroll application in production can access confidential data in production, so long as that connection is encrypted.” And that is how Aporeto allows you to take that rich identity concept that you have for the application and apply it in your operations to protect your environment by allowing only the connections that you have prescribed in a whitelist setting.
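The subject-verb-object policies above can be sketched as tag matching with a default deny. This is an illustration under assumptions, not Aporeto's policy language: the `Policy` class, the tag names (`app`, `env`, `data`) and the `is_allowed` function are hypothetical, and only exact-match tags are modeled.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    subject: dict            # tags the source workload must carry
    verb: str                # e.g. "connect" or "access"
    obj: dict                # tags the destination must carry
    encrypted: bool = False  # True if the policy applies only to encrypted connections


def matches(required: dict, actual: dict) -> bool:
    """True if every required key-value pair is present in the actual tags."""
    return all(actual.get(k) == v for k, v in required.items())


def is_allowed(policies, source, verb, dest, encrypted=False) -> bool:
    """Whitelist check: allow only if some policy matches, otherwise deny."""
    for p in policies:
        if (p.verb == verb
                and matches(p.subject, source)
                and matches(p.obj, dest)
                and (encrypted or not p.encrypted)):
            return True
    return False  # default deny


POLICIES = [
    # "Env=prod connect to env=prod."
    Policy({"env": "prod"}, "connect", {"env": "prod"}),
    # Payroll in production may access confidential production data if encrypted.
    Policy({"app": "payroll", "env": "prod"}, "access",
           {"data": "confidential", "env": "prod"}, encrypted=True),
]
```

Nothing in the check refers to an IP address or subnet, which is the point the speaker makes next: the decision depends only on the tags each side carries.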
Notice that there’s no dependency on the network. Basically the operation could be running everywhere. You could be running part of your applications on AWS, part on your premise, part on Azure. It doesn’t matter. That policy follows the application, no matter where it is. It’s based on application identity and nothing to do with the underlying infrastructure. Therefore you don’t have to do any of the complexities that go with network management in order to protect your application.
Now let me show you what this would look like in theory – well, not in theory but rather in practice. So this is an actual snapshot that I took this morning of all the different key value pairs that I have associated with one particular workload. In this case I took a picture of a Kubernetes workload, and we have various properties that we’ve gleaned from the system. I know what this enforcer ID is. I know that it’s a Docker container. I know what this identity is, and I have more meta-data. It’s an Nginx server in this case that I can read. And when I go to user tags you see that there are a bunch of vulnerability data that’s associated with it. All those orange tags are coming from an image scanner that we have attached to this workload. And on top of it we have user-generated tags that are there. For example, it’s TRT1. It’s our server role. It’s running on GCP – so forth and so on. So all those were user-generated tags that all got sucked in. Now it’s a combination of these things that are cryptographically signed that create the rich application identity for Aporeto.
The way it works in practice is that for us to authenticate an application connecting with another service, we do an end-to-end authentication and authorization, and we have the ability to transfer the encrypted traffic as well. So let’s say that you have two sets of services that are running around and have various attributes. And on the second tier what we have is a policy that says, “Accept anything that’s coming from an instances label as well.” So when this instance issues a request it sends, basically, a network packet. At this point the enforcer intercepts the network packet. It signs and attaches its identity attributes, and it forwards it through. And that forwarding – as long as the two services can talk to each other – that forwarding is taken care of.
And if the services cannot talk to each other, well, you have a bigger problem. The applications can’t connect. So basically we piggyback the identity on the network initiation request. And on the far side it validates the signature. The first question we ask is, “Is the incoming request signed by a source that we can trust?” If the answer is no we simply ignore the request, and we drop it. And the target never sees that request. And on this side we then do a policy look-up. Can the source talk to the destination? And if the answer is yes, then we allow that to go through. And correspondingly the destination responds with a SYN-ACK packet, and we do a symmetric operation. And as such, we have an end-to-end authentication and authorization based on identity and the policy that we have defined.
And at this point you may have defined a policy that the connection must be encrypted. So in these cases Aporeto behaves like a proxy and stays inline and encrypts all traffic transparently to the application so the developer doesn’t have to worry about building any kind of SSL libraries or TLS libraries or key rotation or certificate management or any of the things that are generally very error prone. All that burden is taken away from the developer, and that provides excellent security for a company that is now running in a hybrid environment or in a multi-cloud setting.
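The receiving side's decision sequence described above (check the signature first, then look up policy) can be sketched as follows. This is a simplified model with assumed names: `sign` and `handle_connection_request` are hypothetical, an HMAC with a shared key stands in for whatever signature scheme the product uses, and the talk does not say what happens when the policy look-up fails, so the sketch assumes a default deny in line with the whitelist model.

```python
import hashlib
import hmac
import json


def sign(tags: dict, key: bytes) -> str:
    """Stand-in signature over the identity tags (HMAC-SHA256, an assumption)."""
    payload = json.dumps(tags, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def handle_connection_request(request: dict, trusted_key: bytes, policy_allows) -> str:
    """Return 'drop', 'deny' or 'accept' for an incoming connection request.

    request       -- {"tags": {...}, "signature": "..."} piggybacked on the request
    policy_allows -- callable taking the source tags and returning True or False
    """
    # Step 1: is the request signed by a source we trust?
    expected = sign(request["tags"], trusted_key)
    if not hmac.compare_digest(expected, request["signature"]):
        return "drop"    # ignored; the target workload never sees it

    # Step 2: policy look-up. Can this source talk to this destination?
    if not policy_allows(request["tags"]):
        return "deny"    # assumed default deny on a policy miss

    # Step 3: allowed. The destination would now answer (SYN-ACK) and the
    # initiating side would run the same checks on the reply.
    return "accept"
```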
So here I want to point out that this is an order n operation, going back to a few slides. This is not an n² operation. It’s a linear operation. So the bigger you scale, the more we scale with it, and the better your applications perform. Let me bring this back into a particular use case. This is an actual case study. The reference customer is Exact, and they have deployed a credit card transaction application on the Google Cloud. And before Aporeto what they had done was encrypt the traffic between different availability zones. In this case I’ve labeled them as east and west. They actually had three, but you can imagine the VPN tunnel running between them with gateways and north-south firewalls.
And within each availability zone there were two VPCs, and the VPCs were interconnected with each other with a software virtualization layer, with an SDN layer, and they had east-west firewalls there. They were also using vaults to protect assets, using other segmentation techniques like NATing and access control lists. So that’s the complexity that they were dealing with before Aporeto. And using Aporeto’s strong application identity the environment turned out to be quite simple.
It flattened the network within each availability zone, so everything’s now inside one VPC. Aporeto effectively works as a force field around each application node, and that protects them using the mechanisms with identity and policy that I’ve described before. And the traffic between the availability zones is encrypted automatically by policy. So there’s no longer a need to set up a VPN tunnel and north-south firewalls. Aporeto takes care of that. So you can appreciate the operational simplicity that we bring to the picture as well as a better, stronger security that we enable with application identity.
So basically to wrap this up, I want to talk about the evolution of the data center and how that’s changing and how Aporeto will help you future-proof your business with our strong application identity and our security model. So what we have here on the left side is a traditional data center where you physically manage the infrastructure. You own it. The network is an L1 through L7 network that’s tied very directly to the infrastructure. And the security level – you use firewalls, SDNs, ACLs and so forth, and that’s tied to the network.
And on top you have your static and rigid applications that are n-tier. So that’s your traditional data center model. And what we have witnessed is sort of cloud phase one, where the cloud service provider owns the infrastructure. Your network – you no longer have to worry about the physical layer – L1, L2 – but you do manage L3 through L7, and that’s tied very directly to the infrastructure. On top – on the application layer – what we’re seeing is a transformation away from the rigid application of the n-tier application towards DevOps, microservices based solution.
And the application is getting disaggregated, but we still have the old security model that’s tied directly to the network. What we’re migrating towards is a very dynamic application environment. And by having an identity-based zero trust security layer we also abstract away the complexity that’s tied to the L3 through L7 network that we see in the middle of the page. So where the cloud is going – specifically multi-cloud, hybrid-cloud world is going – is a very dynamic application. And ultimately what you want to do is abstract away the infrastructure, the network and security complexities. And what Aporeto does is helps you migrate from cloud phase one to a cloud-native environment. And that’s a value that we have with strong application identity.
So with that I want to leave you with a couple of links. If you’re interested in learning more about application identity you can go to our website, Aporeto.com, and download the white paper on this topic. It’s application_id_techpaper. Feel free to read that at your leisure. And also you can try our solution, which is by going to console.aporeto.com. You can ask for a trial account, and we’ll be very happy to work with you and to show you how to provide better security for your applications in a cloud-native environment. And if you have any questions you may direct them to me. My email is firstname.lastname@example.org. I’ll be happy to field any questions that you may have via email. But I also see that some questions are coming through, so bear with me as I review those questions.
Okay. So the first question is, “How easy it is to operationalize Aporeto?” And the answer is actually quite easy. We have a simple operational model, and luckily I actually have a slide where I can talk about this. Let me find it. Here we go. So it’s quite easy to get started with Aporeto. Basically what we do is first we visualize your environment. By plugging the Aporeto enforcer in your operational environment we allow you to visualize your applications and see what’s going on.
Second, we simulate security and allow you to understand what the security model would be like. And third, once you’re satisfied with the simulation, then you can enforce it for a particular application. And beyond that, once you have confidence in how to secure a particular application, you can kind of rinse and repeat and expand your operations. And the installation process is quite easy. It’s two command lines. It’s basically downloading the agent and sudo-installing it, and that agent then dials back to the Aporeto service and provides the visualization layer that you’d want to have. So it’s actually quite a simple operational model.
Now I see there’s another question that’s coming in. Bear with me. The question comes from Jeremy. He said, “You said something about learning about the app from the CICD pipeline. Are you integrating with some popular CICD tools?” So Jeremy, the answer is yes. Basically, we have an API-first story. So everything that we do within the application, even at the UI layer, is an API call. And as long as the CICD platform provides RESTful calls we can integrate with those. So Jenkins is an example that we integrate with quite easily.
GitHub is what we use locally, and we integrate with that quite easily as well. Usually the model is that the developer takes Aporeto, effectively as a QA tool, to see what the connectivity of the application is. And that connectivity map then becomes the basis for policy. And the application itself then autogenerates that policy, so when the application gets promoted from dev to prod that policy can also get promoted as an application policy into production. I hope I answered your question.
There is another question that’s coming in. The question is from James. James is asking whether we interrupt production application operations after the application has been [out]. There the answer is we can go into what we call a discovery mode where we monitor the application, and we simply simulate security. And once you’re satisfied with the security simulation, you can put Aporeto into the operational mode, and on this slide you go from step two to step three. That’s when the enforcement begins. So you have the option of using Aporeto not in an enforcement role but rather in a monitoring role, and not enforcing policy until you’re comfortable with it.
There is another question that’s coming in. And that’s also from Jeremy. He’s asking, “Is the app metadata static or continuously evolving and dynamic? If the latter, are you using analytics, etcetera?” So Jeremy, basically, when a container image goes up it’s immutable. So once the application’s operational it stays static. However, when a new container is born – let’s say that you pushed out a new container in your CICD practice – then new metadata comes up.
At that point the application changes. What we do know is some metadata will change, but most of the metadata upon which policy is based stays constant. For example, roles, application names, the cloud identity and so forth. So at this point we don’t have adaptive policy that takes into account every operational tag, but it depends on the set of tags that stay relatively constant through the application lifetime. And if the application policy needs to change, that gets captured in the CICD pipeline in the steps that I outlined before where the developer visualizes the application, learns what the policy is and promotes a new policy into production.
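The difference between the discovery (simulate) mode and the enforcement mode can be pictured with a small sketch. The names `Mode` and `handle_flow` and the flow format are invented for this example and are not Aporeto's API. In discovery mode a flow that policy would block is only recorded, while in enforcement mode it is actually blocked.

```python
from enum import Enum


class Mode(Enum):
    DISCOVERY = "discovery"  # monitor and simulate; never interrupt traffic
    ENFORCE = "enforce"      # apply the policy decision to live traffic


def handle_flow(mode: Mode, policy_allows: bool, flow: str, report: list) -> bool:
    """Return True if the flow is let through.

    Flows that policy would reject are always appended to `report`, so the
    operator can review what enforcement would have done before switching modes.
    """
    if policy_allows:
        return True
    report.append(flow)
    # Discovery mode lets the flow pass anyway; enforce mode blocks it.
    return mode is Mode.DISCOVERY
```

Reviewing the report while still in discovery mode corresponds to the step the speaker describes as being satisfied with the simulation before moving from step two to step three.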
So another question that’s come in, “Are you able to associate workloads to actual apps? For instance, my HR app may be running with hundreds of containers. Is there a concept of what is an app? Or are you working at one container goes with one app?” So the answer is, yes, we do map containers to an application. We depend on the orchestration layer. So when the orchestration layer orchestrates a particular application – as you mentioned – it may spawn hundreds of containers. All those hundred containers are then associated with a single app, and you have a per-container policy or you can have per-application policy or a mix thereof, and Aporeto is able to handle the multi-layered approach quite easily. The best way to see that is actually to sign up for a proof of concept and to see what Aporeto can do for you in action.
And I think that’s all the time that I have. My handler is telling me that we’re running out of time. So with that I’m grateful for your interest in Aporeto, and I hope to see more of your questions. Again, my email is email@example.com, and feel free to sign up for our service at console.aporeto.com. Thank you very much, and have a wonderful day.
[End of recorded material 00:30:55]