CVE-2024-9070: BentoML Deserialization RCE Vulnerability

CVE-2024-9070 Overview

A critical insecure deserialization vulnerability has been discovered in BentoML's runner server, affecting versions <=1.3.4.post1. It allows unauthenticated remote attackers to execute arbitrary code on vulnerable servers by sending requests in which the args-number parameter is set to a value greater than 1, which causes the server to deserialize the request data automatically. The flaw is particularly dangerous because it can be exploited over the network without any user interaction, potentially leading to complete system compromise.

Critical Impact
Unauthenticated remote code execution through insecure deserialization enables attackers to gain full control of affected BentoML servers, potentially compromising machine learning infrastructure and sensitive data.

Affected Products

BentoML (bentoml/bentoml) versions <=1.3.4.post1
BentoML runner server component
Systems running BentoML with exposed runner server endpoints

Discovery Timeline

2025-03-20 - CVE-2024-9070 published to NVD
2025-10-15 - Last updated in NVD database

Technical Details for CVE-2024-9070

Vulnerability Analysis

This vulnerability is classified as CWE-502: Deserialization of Untrusted Data. The BentoML runner server fails to properly validate and sanitize serialized data before processing, allowing attackers to inject malicious payloads that are automatically deserialized by the application.

The vulnerability is triggered when the args-number parameter is set to a value greater than 1. Under these conditions, the runner server automatically deserializes incoming data without proper validation, enabling arbitrary code execution on the target system. This is particularly concerning for machine learning deployment platforms like BentoML, which are often exposed to network traffic and handle sensitive model data.

The network-accessible attack vector combined with no required authentication or user interaction makes this vulnerability highly exploitable in production environments.

Root Cause

The root cause lies in the improper handling of serialized objects within BentoML's runner server. When processing requests with multiple arguments (args-number > 1), the server automatically deserializes input data without security controls such as input validation, type allowlisting, or sandboxed deserialization. This design flaw allows attackers to craft malicious serialized payloads that execute arbitrary code when deserialized.
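
To illustrate the kind of type restriction the root cause describes as missing, here is a minimal sketch that assumes the serialized data is a Python pickle. It follows the "restricting globals" pattern from the Python standard library documentation and is not BentoML's actual fix. The names ALLOWED_GLOBALS, AllowlistUnpickler, restricted_loads, and Probe are hypothetical and chosen only for this example.

```python
import io
import pickle

# Hypothetical allowlist: only these (module, name) pairs may be resolved
# while unpickling. Anything else is rejected.
ALLOWED_GLOBALS = {
    ("builtins", "set"),
    ("builtins", "frozenset"),
    ("collections", "OrderedDict"),
}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not named in ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Analogue of pickle.loads() that enforces the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Probe:
    """Benign stand-in for an attacker-controlled object."""

    def __reduce__(self):
        # When unpickled, pickle calls len("abc"). An attacker would name a
        # dangerous callable here instead of len.
        return (len, ("abc",))


if __name__ == "__main__":
    payload = pickle.dumps(Probe())
    print(pickle.loads(payload))  # plain loads() runs the embedded call
    try:
        restricted_loads(payload)
    except pickle.UnpicklingError as exc:
        print("blocked:", exc)
```

Plain containers such as lists and dicts still load, because they are encoded with dedicated opcodes and never pass through find_class. An allowlist narrows the attack surface but does not make pickle safe for untrusted input in general, so a data-only format remains the more robust choice where it is practical.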

Attack Vector

The attack is network-based and can be executed remotely against any exposed BentoML runner server. An attacker crafts a specially formatted request to the runner server endpoint, setting the args-number parameter to a value greater than 1 and including a malicious serialized payload. When the server processes this request, it deserializes the payload without proper validation, triggering arbitrary code execution.

The attack requires no authentication and no user interaction, making it suitable for automated exploitation. Attackers can leverage this vulnerability to establish persistent access, exfiltrate sensitive machine learning models, or pivot to other systems within the network.

For detailed technical information about this vulnerability, refer to the Huntr Bounty Listing.

Detection Methods for CVE-2024-9070

Indicators of Compromise

Unusual network traffic to BentoML runner server endpoints with abnormal parameter values
Requests containing args-number parameter values greater than 1 with suspicious serialized data
Unexpected process spawning or command execution originating from BentoML server processes
Anomalous outbound network connections from systems running BentoML services

Detection Strategies

Monitor HTTP/HTTPS traffic to BentoML runner server endpoints for requests with args-number > 1
Implement application-level logging to capture and analyze all deserialization operations
Deploy network intrusion detection rules to identify serialized Python object patterns in request payloads
Audit BentoML server logs for unexpected errors or exceptions related to deserialization
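
The detection strategies above could be prototyped with a small heuristic like the following. It is a sketch under two assumptions that the advisory does not confirm: that args-number arrives as an HTTP header with a lower-cased name, and that the serialized payload is a Python pickle using protocol 2 or later. The function names looks_like_pickle and is_suspicious are hypothetical.

```python
import pickle

PICKLE_PROTO_OPCODE = 0x80  # PROTO opcode that opens protocol 2+ streams
PICKLE_STOP_OPCODE = 0x2E   # STOP opcode ('.') that ends every pickle


def looks_like_pickle(body: bytes) -> bool:
    """Heuristic: does the body start and end like a protocol 2+ pickle?"""
    return (
        len(body) >= 3
        and body[0] == PICKLE_PROTO_OPCODE
        and 2 <= body[1] <= pickle.HIGHEST_PROTOCOL
        and body[-1] == PICKLE_STOP_OPCODE
    )


def is_suspicious(headers: dict, body: bytes) -> bool:
    """Flag requests with args-number > 1 that carry a pickle-like body.

    `headers` is assumed to map lower-cased header names to string values.
    """
    try:
        args_number = int(headers.get("args-number", ""))
    except ValueError:
        # Missing or non-numeric value: not a match for this rule.
        return False
    return args_number > 1 and looks_like_pickle(body)
```

A match marks a request for review and does not prove an attack, since legitimate multi-argument calls may look the same. Protocol 0 and 1 pickles have no PROTO header and would slip past this check, so it complements rather than replaces source-address and process monitoring.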

Monitoring Recommendations

Enable detailed logging on all BentoML runner server instances
Implement real-time alerting for deserialization-related exceptions or errors
Monitor system processes for unexpected child processes spawned by BentoML services
Track resource utilization anomalies that may indicate post-exploitation activities

How to Mitigate CVE-2024-9070

Immediate Actions Required

Upgrade BentoML to a version newer than 1.3.4.post1 that contains the security fix
Restrict network access to BentoML runner server endpoints using firewall rules or network segmentation
Implement Web Application Firewall (WAF) rules to inspect and filter requests with potentially malicious serialized payloads
Audit exposed BentoML instances and disable any unnecessary runner server endpoints
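
As an aid to the audit and upgrade steps above, the following sketch checks whether a version string falls inside the affected range stated in this advisory (<=1.3.4.post1). The helper names are hypothetical, and the parser is deliberately simplified: it accepts only X.Y.Z and X.Y.Z.postN and rejects pre-release, dev, and local versions instead of guessing how they sort.

```python
import re
from importlib import metadata

# Highest affected release according to this advisory.
LAST_AFFECTED = "1.3.4.post1"

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.post(\d+))?$")


def parse_version(text: str) -> tuple:
    """Parse 'X.Y.Z' or 'X.Y.Z.postN' into a sortable tuple."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported version format: {text!r}")
    major, minor, patch, post = match.groups()
    # A release without a post segment sorts below .post0, hence -1.
    post_number = int(post) if post is not None else -1
    return (int(major), int(minor), int(patch), post_number)


def is_affected(version: str) -> bool:
    """True if `version` is at or below the last affected release."""
    return parse_version(version) <= parse_version(LAST_AFFECTED)


def installed_bentoml_is_affected():
    """Check the installed bentoml package; None if it is not installed."""
    try:
        return is_affected(metadata.version("bentoml"))
    except metadata.PackageNotFoundError:
        return None
```

A result of False only means the version is outside the range quoted here. Confirm the specific patched release against the official BentoML release notes before treating a host as remediated.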

Patch Information

Organizations should upgrade to the latest version of BentoML that addresses this vulnerability. Consult the official BentoML repository and release notes for the specific patched version. For detailed vulnerability information, see the Huntr Bounty Listing.

Workarounds

Place BentoML runner servers behind authenticated reverse proxies to prevent unauthenticated access
Implement network-level access controls to restrict runner server access to trusted IP addresses only
Deploy application-level input validation to reject requests with args-number values greater than 1 until patching is possible
Run BentoML services in isolated container environments with limited privileges to minimize impact of potential exploitation

bash

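# Example: Restrict BentoML runner server access using iptables
# Port 3000 is an example value; substitute the port your runner server listens on
# Allow only the trusted internal network (10.0.0.0/8 here) to reach that port
iptables -A INPUT -p tcp --dport 3000 -s 10.0.0.0/8 -j ACCEPT
# Drop all other traffic to the port; this rule must come after the ACCEPT rule
iptables -A INPUT -p tcp --dport 3000 -j DROP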