The IT world is changing rapidly as containers and Kubernetes (K8s) become increasingly popular. In just seven years, we’ve moved from virtual machines to containers and then to container orchestration platforms (the first Docker release launched in 2013). While some startups are still learning how these new resources can serve them, more established companies are looking into migrating their legacy systems to more efficient infrastructure.
While the rapid adoption of containers and Kubernetes shows just how disruptive these technologies have been, it has also led to new security problems. Their widespread popularity, combined with the many organizations running them without proper security measures in place, has made containers and Kubernetes an attractive target for attackers.
A K8s cluster is a set of machines managed by a master node (and its replicas). It can span thousands of machines and services, which makes it a prime target and gives attackers many possible attack vectors. Adopting strict security practices is therefore crucial.
Securing Your Cluster
There are many moving parts within a Kubernetes cluster that must be properly secured. Cluster security cannot be achieved with a single measure; it involves a number of best practices and requires a competent security team.
Below, we’ll cover a number of different Kubernetes attack vectors along with best practices for keeping your K8s cluster secure.
Ensuring Kubernetes and Its Nodes Are Up to Date
K8s is an open-source system that is continuously updated. Its GitHub repository is one of the platform’s most active repositories. As such, new features, refinements, and security updates are constantly being introduced.
Roughly every four months, a new Kubernetes minor version (the "y" in 1.y.z) is released. Each release adds features that improve the service but may also introduce new bugs or security issues, something all software is susceptible to, especially when it is updated frequently.
Vulnerabilities are found in older versions too, however. Understanding how the Kubernetes team handles security updates for older versions is therefore critical. Unlike many Linux distributions and other platforms, Kubernetes does not have an LTS version; instead, the project backports security fixes to the three most recent minor versions.
It is therefore vital to keep your cluster on one of the three most recent minor versions, to keep on top of security patches, and to plan an upgrade to the latest minor version at least every twelve months.
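The support-window rule above can be checked mechanically. The following is a minimal Python sketch, not an official tool: the function names and the version strings in the usage example are hypothetical, and the window of three releases is taken from the text above.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Extract the first two numeric components from a string such as 'v1.29.3'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_in_support_window(cluster: str, latest: str, window: int = 3) -> bool:
    """Return True if `cluster` is one of the `window` newest releases.

    Simplifying assumption: both versions share the same first component.
    A cluster version newer than `latest` is reported as False, because it
    suggests the `latest` value is stale and should be rechecked.
    """
    c_first, c_second = parse_version(cluster)
    l_first, l_second = parse_version(latest)
    if c_first != l_first:
        return False
    return 0 <= l_second - c_second < window


if __name__ == "__main__":
    # Placeholder versions, for illustration only.
    print(is_in_support_window("v1.28.4", "v1.30.1"))
    print(is_in_support_window("v1.27.9", "v1.30.1"))
```

A check like this could run on a schedule and raise an alert before the cluster drops out of the patched range, rather than after.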
Beyond its main components, Kubernetes also manages the nodes that run the workloads assigned to the cluster. These nodes can be physical or virtual machines with an operating system running on them. Each node is a potential entry point for an attacker and must be kept updated to address any security issues. Nodes should also be kept as minimal as possible to reduce the attack surface.
Limit User Access
Role-based access control (RBAC) is one of the best ways to control who can access the cluster and what they can do. It provides a fine-grained permission set for each user. The rules are purely additive: nothing is allowed by default, so every permission must be granted explicitly. With RBAC, it is possible to restrict the actions (such as viewing, reading, or writing) permitted on each Kubernetes object, from pods (the smallest K8s computing unit) to namespaces.
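To make the additive model concrete, here is a small Python sketch. The dictionary follows the shape of a Role in the rbac.authorization.k8s.io/v1 API, but the role name, the namespace, and the is_allowed helper are hypothetical, and the matching logic is a deliberately simplified model rather than the real authorizer (it ignores resource names, subresources, and cluster-wide roles).

```python
import json

# A namespaced Role granting read-only access to pods.
# "pod-reader" and "team-a" are illustrative names.
pod_reader_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "pod-reader", "namespace": "team-a"},
    "rules": [
        {
            "apiGroups": [""],  # "" is the core API group, where pods live
            "resources": ["pods"],
            "verbs": ["get", "list", "watch"],
        }
    ],
}


def _matches(allowed: list, value: str) -> bool:
    """A rule field matches if it lists the value or the '*' wildcard."""
    return "*" in allowed or value in allowed


def is_allowed(role: dict, api_group: str, resource: str, verb: str) -> bool:
    """Simplified additive check: allow only if some rule grants the request."""
    for rule in role["rules"]:
        if (
            _matches(rule["apiGroups"], api_group)
            and _matches(rule["resources"], resource)
            and _matches(rule["verbs"], verb)
        ):
            return True
    return False  # no rule matched, so the request is denied by default


if __name__ == "__main__":
    # JSON manifests are accepted wherever YAML manifests are.
    print(json.dumps(pod_reader_role, indent=2))
```

Because there are no deny rules, tightening access always means removing a grant, which is why starting from narrow roles tends to be easier to reason about than trimming broad ones later.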
RBAC can also be combined with an external directory service using OpenID Connect tokens. This allows user and group management to be defined centrally and reused more widely within the organization.
Access control is not limited to Kubernetes itself. Sometimes users need to access a cluster node directly, for example to diagnose a problem. In such cases, it is better to create temporary users for the task and delete them afterward.
Best Practices for Containers
Docker, the most prominent container technology, is made up of layers: the innermost layer is the most primitive structure, while the outer layer is the most specific. Thus, all Docker images begin with some type of distribution or language support, with each new layer adding or modifying the previous functionality until the very last layer. The container should then have everything it requires to spin up the application.
The images built from these layers may be available publicly on Docker Hub or privately in another image registry. An image can be referenced in two ways: by a name plus a tag (e.g., node:latest) or by its immutable SHA digest (e.g., sha256:d64072a554283e64e1bfeb1bb457b7b293b6cd5bb61504afaa3bdd5da2a7bc4b for the same image at the time of writing).
The image associated with a tag can be changed at any time by the repository owner; the latest tag, for instance, conventionally points to the newest version available. It also means that when you build on or run an image referenced by tag, the underlying layers can change suddenly, without any notice.
This of course poses some problems: (1) you lose control of what is running in your Kubernetes cluster, as an underlying layer can be updated and introduce a conflict, and (2) the image can be intentionally modified to introduce a vulnerability.
To prevent the first issue, avoid the latest tag and opt for a more version-specific tag (e.g., node:14.5.0). To avoid the second, opt for official images, copy the image to your private registry, or reference the image by its SHA value.
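These rules are easy to enforce automatically, for example in a review step before a manifest is deployed. The sketch below is a hedged illustration: the classify_image function and its three category names are hypothetical, and the parsing covers only the common reference forms discussed above.

```python
def classify_image(ref: str) -> str:
    """Classify an image reference by how tightly it pins the image.

    Returns one of:
      'digest-pinned'  - referenced by immutable sha256 digest
      'version-tagged' - referenced by a specific tag (still movable by the owner)
      'floating'       - 'latest' or no tag at all
    """
    if "@sha256:" in ref:
        return "digest-pinned"
    # Only the last path component can carry a tag; a colon earlier in the
    # reference may be a registry port (e.g. localhost:5000/node).
    last_component = ref.rsplit("/", 1)[-1]
    if ":" not in last_component:
        return "floating"  # an untagged reference resolves to 'latest'
    tag = last_component.split(":", 1)[1]
    return "floating" if tag == "latest" else "version-tagged"


if __name__ == "__main__":
    for image in ("node:latest", "node:14.5.0", "localhost:5000/node"):
        print(image, "->", classify_image(image))
```

A pipeline could reject anything classified as floating and warn on version-tagged references, since only a digest guarantees that the bytes being run are the bytes that were reviewed.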
Another approach is using a vulnerability detection tool to continuously scan the images used. These tools can run together with continuous integration pipelines and can monitor the image repository to identify previously undetected issues.
When building a new image, it’s important to remember that each image should contain only one service. The entire image should be built so that it has only the dependencies needed for that application and nothing else. This reduces the attack surface to only the components essential to the service. Having only one application per image also makes it easier to update to a new version and to allocate resources in the orchestrator.
Network Security
The previous section was all about reducing the attack surface, and the same applies to networking. Kubernetes provides virtual networking inside the cluster, with rules that can restrict traffic between pods and control external access so that only permitted services are reachable. It is a basic solution that works well in small clusters.
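One way such restrictions are commonly expressed is the NetworkPolicy resource; that this is the mechanism the paragraph above has in mind is an assumption. The Python sketch below builds two manifests in the shape of the networking.k8s.io/v1 API: one that denies all incoming traffic to pods in a namespace, and one that re-allows a single path. The policy names, labels, namespace, and port are hypothetical, and policies are only enforced when the cluster's network plugin supports them.

```python
import json


def default_deny_ingress(namespace: str) -> dict:
    """Select every pod in the namespace and allow no ingress traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches all pods
            "policyTypes": ["Ingress"],
        },
    }


def allow_ingress(namespace: str, target_app: str, source_app: str, port: int) -> dict:
    """Allow pods labelled app=source_app to reach app=target_app on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": f"allow-{source_app}-to-{target_app}",
            "namespace": namespace,
        },
        "spec": {
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": source_app}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Illustrative values only.
    print(json.dumps(default_deny_ingress("team-a"), indent=2))
    print(json.dumps(allow_ingress("team-a", "api", "frontend", 8080), indent=2))
```

Starting from a deny-all policy and adding narrow allowances mirrors the additive approach used for user permissions: anything not explicitly permitted stays unreachable.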
But bigger clusters that contain several services developed by different teams are far more complex, and a centralized approach may be impossible to manage. In such cases, service meshes are currently the best available method. A service mesh adds a network encryption layer that allows services to communicate with each other securely. It usually works as a sidecar agent that is attached to each pod and handles communication between services. Service meshes are not only about security; they also provide service discovery, monitoring, tracing, and logging, and they help avoid service interruptions by applying patterns such as circuit breaking.
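For readers unfamiliar with circuit breaking, the pattern can be sketched in a few lines of Python. This is only an illustration of the idea: in a service mesh the logic lives in the sidecar proxy rather than in application code, and the class name, thresholds, and injectable clock below are all hypothetical.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for a cool-down period."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before retrying
        self.clock = clock                # injectable for testing
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call rejected")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single further failure reopens the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

Rejecting calls quickly while a dependency is down keeps one failing service from tying up the resources of every service that depends on it.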
Establishing Resource Quotas
Because applications are updated all the time, the measures above are insufficient on their own; there is still a risk of a security breach.
Resource quotas are another important step: Kubernetes caps resource consumption at the established constraints, which limits how far an outage can spread. If the constraints are well designed, they will prevent all cluster services from becoming unavailable due to resource exhaustion.
They can also prevent you from racking up a massive cloud bill at the end of the month.
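As a rough illustration, the Python sketch below builds a manifest in the shape of the core v1 ResourceQuota resource and models the check a quota implies. The quota name, namespace, and limits are hypothetical, and fits_quota is a simplified stand-in for the real admission logic: it works on plain numbers (millicores and MiB) instead of parsing Kubernetes quantity strings.

```python
import json


def resource_quota(namespace: str, cpu: str, memory: str, pods: int) -> dict:
    """Build a ResourceQuota capping total requests and pod count in a namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "compute-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "pods": str(pods),
            }
        },
    }


def fits_quota(hard: dict, used: dict, requested: dict) -> bool:
    """Simplified model: a request fits if no limited resource would exceed its cap."""
    return all(
        used.get(name, 0) + amount <= hard[name]
        for name, amount in requested.items()
        if name in hard  # resources without a cap are not limited
    )


if __name__ == "__main__":
    # Illustrative limits only.
    print(json.dumps(resource_quota("team-a", "4", "8Gi", 10), indent=2))
```

With a quota per namespace, a runaway or compromised workload can exhaust only its own allocation, leaving the rest of the cluster available.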
Monitoring and Logging
Monitoring the cluster, from the nodes down to individual pods, is essential for discovering outages and pinpointing their cause. It is all about detecting anomalous behavior: if network traffic has increased or the nodes’ CPU usage is behaving differently, further investigation is needed to rule out any issues. While monitoring is more about metrics such as CPU, memory, and networking, logging can provide additional (historical) information that can help detect unusual patterns or quickly identify the source of a problem.
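What counts as anomalous depends on the workload, but the basic idea can be sketched simply. The Python example below flags a sample that lies far from the recent average; the function name, the threshold of three standard deviations, and the sample values are assumptions chosen for illustration, and real monitoring systems typically use more robust rules.

```python
from statistics import mean, pstdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it is more than `threshold` standard deviations
    away from the mean of `history` (a simple z-score test)."""
    if len(history) < 2:
        return False  # not enough data to judge
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return latest != mu  # any change from a perfectly flat baseline
    return abs(latest - mu) / sigma > threshold


if __name__ == "__main__":
    # Hypothetical CPU utilization samples for one node, in percent.
    cpu_history = [40, 42, 41, 39, 43, 40, 41, 42]
    print(is_anomalous(cpu_history, 42))
    print(is_anomalous(cpu_history, 95))
```

A flagged sample is a prompt for investigation rather than proof of an attack; correlating it with logs from the same time window is usually the next step.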
Prometheus and Grafana combined are effective tools for Kubernetes monitoring. Prometheus is a monitoring system built around a high-performance time-series database, while Grafana is a visualization tool that can read Prometheus data and present it in easy-to-read dashboards.
Elasticsearch is another useful tool and one of the most popular for near real-time centralized logging of the application, nodes, and Kubernetes itself.
Cloud vs. On-Premises: The Security Perspective
A Kubernetes installation can either run on-premises or use a managed cloud service. In the on-premises scenario, everything (spinning up new machines, setting up networking, and securing the application) must be set up manually. Managed cloud services such as Google GKE, AWS EKS, or Azure AKS allow K8s to be installed with minimal configuration and integrate with other services from the cloud provider.
From a security perspective, on-premises solutions demand much more attention. As noted previously, every new update must be downloaded and applied by the team running the cluster, and the nodes must be updated as well. It is therefore recommended that only an experienced team deploy on-premises Kubernetes.
With managed cloud services, on the other hand, the process is far simpler, as Kubernetes is already installed and the cloud vendor keeps all nodes updated with the latest security fixes. From the cluster perspective, most cloud providers allow the user to choose the K8s version from a set and also provide ways to upgrade it to a new version. And so, while it is more straightforward, there is also less flexibility.
Final Notes
With continuous updates and the flood of new tools on the market, staying up to date and keeping on top of vulnerabilities can be challenging. Breaches are inevitable. With Kubernetes, the challenge is even greater, as it is more than just a tool. Rather, Kubernetes is a set of tools that manages other tools, machines, and networks, and its security is therefore essential.
But with so many moving parts, keeping your Kubernetes cluster secure is no trivial task, so be sure to follow these guidelines:
- Scan applications running on K8s for security issues.
- Limit and control access.
- Ensure everything is patched with the latest security updates.
- Continuously monitor the cluster so that outages are addressed immediately and damage is mitigated.
The challenge is even greater with on-premises deployments, where there is real hardware to manage, automations to create, and more software to keep updated. But following the best practices discussed herein can give you a major security advantage and help keep your Kubernetes environment safe and running.