In this blog post, we build on the concepts from Climbing The Ladder | Kubernetes Privilege Escalation (Part 1), which examined privilege escalation in Kubernetes environments and the danger of system pods, and take a deeper dive by analyzing a concrete use case.
Part 2 of this series explores how a chain of misconfigurations in Google’s GKE System Pods constitutes a vulnerability (GCP-2023-047) and how an attacker could chain them together to escalate privileges, compromise critical resources, become cluster admin, and take control of an entire Kubernetes cluster.
The Use Case of GCP-2023-047 | A SentinelOne Perspective
The use case of GCP-2023-047, found by the author of this blog, highlights a vulnerability within Google Kubernetes Engine (GKE) that stems from a chain of default misconfigurations in System Pods, consisting of:
- A FluentBit DaemonSet misconfiguration
- Excessive permissions in an Anthos DaemonSet, and
- Overly-permissive service accounts.
The security bulletin explains that, “An attacker who has compromised the Fluent Bit logging container could combine that access with high privileges required by Cloud Service Mesh (on clusters that have enabled it) to escalate privileges in the cluster.” Individually, each of these default misconfigurations might seem minor, but combined, they offer a path for attackers to escalate privileges and gain control over the entire Kubernetes cluster.
Issue 1: FluentBit DaemonSet Misconfiguration
As illustrated in Figure 1, logging system pods like FluentBit are designed to collect and aggregate data from various pods across the Kubernetes cluster. This necessitates some level of access to each pod. FluentBit, the default logging agent in GKE, is deployed as a DaemonSet across all nodes, running with a configuration that inadvertently exposes sensitive pod tokens.
By mounting the /var/lib/kubelet/pods volume, it gains access to the kube-api-access directory, which contains service account tokens crucial for Kubernetes API interactions. Although FluentBit does not need direct API access, this setup exposes the cluster to significant risk.
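One way to spot this kind of exposure is to audit DaemonSet manifests for hostPath volumes that cover the kubelet pods directory. The following is a minimal sketch, assuming the manifest has already been parsed into a Python dict (for example from YAML or JSON). The function names are hypothetical, and the check looks only at the volume path, not at how the container mounts it.

```python
from pathlib import PurePosixPath

# Host directory named in the post; the kubelet keeps per-pod volumes here.
KUBELET_PODS_DIR = PurePosixPath("/var/lib/kubelet/pods")


def exposes_kubelet_pods(host_path: str) -> bool:
    """Return True if mounting host_path would reach the kubelet pods directory."""
    mounted = PurePosixPath(host_path)
    # Risky if the mount is the pods directory itself, one of its ancestors
    # (such as /var/lib/kubelet or /), or a subdirectory inside it.
    return (
        mounted == KUBELET_PODS_DIR
        or mounted in KUBELET_PODS_DIR.parents
        or KUBELET_PODS_DIR in mounted.parents
    )


def risky_host_paths(daemonset: dict) -> list[str]:
    """List hostPath volumes in a parsed DaemonSet manifest that may expose pod tokens."""
    pod_spec = daemonset.get("spec", {}).get("template", {}).get("spec", {})
    findings = []
    for volume in pod_spec.get("volumes") or []:
        path = (volume.get("hostPath") or {}).get("path")
        if path and exposes_kubelet_pods(path):
            findings.append(f"{volume.get('name', '<unnamed>')}: {path}")
    return findings
```

A finding here is a prompt for review rather than proof of a vulnerability, since some agents may need part of this tree for legitimate reasons.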
Issue 2: Anthos Service Mesh (ASM) CNI DaemonSet Excessive Permissions
Anthos Service Mesh (ASM), Google’s managed implementation of Istio, manages inter-service communications within GKE. Its CNI DaemonSet, named istio-cni-node and responsible for installing and configuring the Istio CNI plugin, is initially granted elevated permissions, including specific RBAC privileges. These heightened permissions persist unnecessarily after installation.
Issue 3: Preinstalled, Highly Privileged Service Accounts
Within the kube-system namespace, GKE houses several preinstalled service accounts endowed with significant privileges. Notably, the clusterrole-aggregation-controller service account possesses the capability to modify cluster roles. An attacker accessing this account could adjust its associated roles, escalating their privileges to cluster-admin levels.
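The ability to modify cluster roles beyond one's own permissions corresponds in Kubernetes RBAC to the escalate verb on the clusterroles resource. The sketch below flags rule sets that grant it, assuming the rules have been exported as a list of dicts in the same shape as a ClusterRole's rules field. The function name is hypothetical, and the sample rule in the usage note is illustrative rather than copied from a GKE cluster.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"


def grants_role_escalation(rules: list[dict]) -> bool:
    """Return True if any rule allows 'escalate' on clusterroles, directly or via wildcards."""
    for rule in rules:
        groups = set(rule.get("apiGroups") or [])
        resources = set(rule.get("resources") or [])
        verbs = set(rule.get("verbs") or [])
        # A wildcard in any field matches everything in that field.
        group_match = bool(groups & {RBAC_GROUP, "*"})
        resource_match = bool(resources & {"clusterroles", "*"})
        verb_match = bool(verbs & {"escalate", "*"})
        if group_match and resource_match and verb_match:
            return True
    return False
```

For example, a rule with apiGroups ["rbac.authorization.k8s.io"], resources ["clusterroles"], and verbs including "escalate" is flagged, while a rule that only reads pods is not.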
Exploitation Scenario
The scenario below outlines a multi-step attack on a Kubernetes cluster, starting with the attacker compromising a FluentBit pod and ending with escalation to cluster-admin permissions and full control of the cluster.
- Initial Compromise – An attacker gains access to a FluentBit pod, possibly through application vulnerabilities or misconfigurations.
- Token Harvesting – Using the pod’s mount of /var/lib/kubelet/pods, the attacker extracts service account tokens from kube-api-access directories.
- API Server Access – With these tokens, the attacker interacts with the Kubernetes API server, impersonating privileged service accounts.
- Exploiting ASM CNI DaemonSet – Leveraging the excessive permissions of the service account associated with the ASM CNI DaemonSet, the attacker deploys a new Pod within the kube-system namespace.
- Privilege Escalation – As shown in Figure 3, the attacker targets the clusterrole-aggregation-controller service account and binds it to a newly created pod. The attacker then uses the clusterrole-aggregation-controller permissions to escalate their own privileges to cluster-admin, enabling them to perform any operation across all namespaces within the cluster.
- Full Cluster Compromise – Repeating step 2, the attacker extracts the service account token of the new pod from the kube-api-access directories and uses it to impersonate and operate as cluster admin.
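When assessing the impact of exposed tokens like those in the steps above, it helps to know which service account each token represents. Kubernetes service account tokens are JWTs whose sub claim has the form system:serviceaccount:<namespace>:<name>. The sketch below reads that claim; the function name is hypothetical, and the code deliberately does not verify the signature, so it is suitable only for inventory and triage, not for authentication decisions.

```python
import base64
import json

SUBJECT_PREFIX = "system:serviceaccount:"


def service_account_from_token(token: str):
    """Return (namespace, name) from a service account JWT's 'sub' claim, or None."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    payload_b64 = parts[1]
    # JWT segments are base64url without padding; restore padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    except ValueError:
        # Covers malformed base64, non-UTF-8 bytes, and invalid JSON.
        return None
    subject = claims.get("sub") if isinstance(claims, dict) else None
    if not isinstance(subject, str) or not subject.startswith(SUBJECT_PREFIX):
        return None
    namespace, _, name = subject[len(SUBJECT_PREFIX):].partition(":")
    return (namespace, name) if namespace and name else None
```

A token that maps to a highly privileged account in kube-system, such as clusterrole-aggregation-controller, indicates that the exposure reaches the accounts described in Issue 3.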
Summary
It is crucial to move beyond theoretical examinations of techniques and understand how an attack might actually unfold. In this blog, we have explored GCP-2023-047 and how a combination of GCP misconfigurations and excessive privileges could be used to perform sophisticated privilege escalation and result in control of entire clusters.
Attacks like these underscore the need for both proactive and reactive security controls for Kubernetes environments. SentinelOne Cloud Security helps organizations secure their containerized applications by providing the full range of security controls needed, including Container and Kubernetes Security and Container Runtime Security.
Next Up
Lateral traversal (aka lateral movement) is a tactic used by threat actors to move from one system or environment to another. It often goes unnoticed because the activity blends in with normal operations, which makes detecting it critical to identifying and preventing sophisticated cyberattacks.
In our next post within this blog series, we dive into how threat actors move beyond their initial landing into an environment and traverse laterally toward high-value resources using a real-world AWS Lambda example.