
Google Vertex AI SDK Flaw Exposes Model Uploads to Hijacking


Unit 42 measured about 2.5 seconds between the victim's upload and Vertex AI reading the file — a window an attacker could exploit to swap a model and execute code inside Google's serving infrastructure.

The vulnerability in the Vertex AI Python SDK

Palo Alto Networks Unit 42 discovered a flaw in the Google Cloud Vertex AI SDK for Python that allowed an attacker with no access to a victim's project to hijack a model upload and run code inside Vertex AI serving containers. The bug lived in how the SDK selected a temporary Cloud Storage staging bucket when a developer did not explicitly set one: the SDK generated a predictable name from the project ID and region (for example, project-vertex-staging-region), checked only whether the bucket existed, and did not verify ownership.
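The difference between an existence check and an ownership check can be illustrated with a toy model. This is only a sketch and not the SDK's actual code: the dictionary standing in for the global bucket namespace, the function names, and the project IDs are all hypothetical, and the name format simply follows the pattern quoted above.

```python
from typing import Dict

# Toy stand-in for the global Cloud Storage namespace: bucket name -> owning project.
GlobalNamespace = Dict[str, str]


def default_staging_bucket_name(project_id: str, region: str) -> str:
    """Build a predictable name following the pattern quoted in the article."""
    return f"{project_id}-vertex-staging-{region}"


def choose_bucket_existence_only(
    project_id: str, region: str, namespace: GlobalNamespace
) -> str:
    """Flawed logic: create the bucket if missing, otherwise use it as-is."""
    name = default_staging_bucket_name(project_id, region)
    if name not in namespace:
        namespace[name] = project_id  # the caller creates and owns the bucket
    return name  # returned even when another project owns the bucket


def choose_bucket_with_ownership_check(
    project_id: str, region: str, namespace: GlobalNamespace
) -> str:
    """Safer logic: refuse a bucket that belongs to a different project."""
    name = default_staging_bucket_name(project_id, region)
    owner = namespace.setdefault(name, project_id)
    if owner != project_id:
        raise PermissionError(f"bucket {name!r} belongs to another project")
    return name


# An attacker pre-creates the bucket the victim's SDK would derive.
squatted = {
    default_staging_bucket_name("victim-proj", "us-central1"): "attacker-proj"
}
used = choose_bucket_existence_only("victim-proj", "us-central1", squatted)
print(used, "is owned by", squatted[used])
```

With the existence-only logic the victim silently ends up writing to a bucket owned by `attacker-proj`; the ownership-checking variant raises `PermissionError` for the same input.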

How "bucket squatting" and "Pickle in the Middle" worked

Because Cloud Storage bucket names are globally unique, an attacker who could guess the generated name — needing only the victim's project ID and a separate Google Cloud project of their own — could create that bucket first. The victim's SDK would then upload model files into the attacker's bucket. Many Python ML models are saved with pickle or joblib, which can run arbitrary code when a file is loaded. Unit 42 calls the combined technique "Pickle in the Middle": the attacker swaps the uploaded model with a malicious one, Vertex AI later loads the swapped model, and the attacker's code executes inside the serving container.
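Why a swapped pickle file is dangerous can be seen in a deliberately harmless example. The class name below is hypothetical, and the callable chosen (`int`) is benign; the point is only that `pickle.loads` calls whatever callable the file names, rather than simply reconstructing data.

```python
import pickle


class LooksLikeAModel:
    """Hypothetical stand-in for a serialized model object."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call int('42')".
        # A malicious file could name any importable callable here instead.
        return (int, ("42",))


payload = pickle.dumps(LooksLikeAModel())

# Loading does not return a LooksLikeAModel instance; it returns the
# result of the call that the file instructed the loader to make.
loaded = pickle.loads(payload)
print(type(loaded).__name__, loaded)
```

Because the loader runs the call before the application ever inspects the result, validating the object after loading comes too late; the file has to be trusted before it is opened.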

Measured speed, the proof-of-concept, and what the attacker could access

The attack relied on timing. Unit 42 measured roughly 2.5 seconds between the victim's upload and Vertex AI reading the file; in their proof-of-concept the attacker used a Cloud Function to detect the upload and replace the object in about 1.4 seconds, beating Vertex AI's read. The proof-of-concept payload then stole an OAuth token from the serving container's metadata server and exfiltrated it to the attacker. In Unit 42's test environment that token was not limited to the compromised deployment: it could access other model artifacts in the same Google-managed tenant project (including a full TensorFlow model with trained weights), as well as BigQuery metadata, access lists, tenant logs, GKE cluster names, and internal container image paths.
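The timing figures above reduce to simple arithmetic. The sketch below uses only the two measurements reported by Unit 42; the function name is illustrative, and real latencies would vary by region and load.

```python
def race_margin_seconds(read_delay: float, swap_latency: float) -> float:
    """Time left over after the swap completes and before the service reads the file."""
    return read_delay - swap_latency


READ_DELAY = 2.5    # seconds between upload and Vertex AI's read (Unit 42 measurement)
SWAP_LATENCY = 1.4  # seconds for the proof-of-concept to replace the object

margin = race_margin_seconds(READ_DELAY, SWAP_LATENCY)
attacker_wins = margin > 0
print(f"margin: {margin:.1f} s, attacker wins race: {attacker_wins}")
```

A positive margin of roughly one second means the swap did not need to be finely tuned to succeed in the reported test.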

Scope, conditions, and detection

The attack worked only under specific conditions: the victim's default staging bucket did not already exist in that region (a common state for a new Vertex AI project in a region), and the victim left the staging_bucket parameter unset and relied on the SDK default. Unit 42 tested google-cloud-aiplatform versions 1.139.0 and 1.140.0 and found both vulnerable. Unit 42 also reported that it saw no exploitation in the wild.
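The two preconditions can be written as a small predicate, which may help when reviewing past uploads. This is a sketch: the function and parameter names are hypothetical, and it assumes the upload ran on a vulnerable SDK version.

```python
from typing import Optional


def upload_was_squattable(
    staging_bucket: Optional[str], default_bucket_already_created: bool
) -> bool:
    """Return True when both conditions described in the article held.

    staging_bucket: the value passed by the caller, or None/empty if unset.
    default_bucket_already_created: whether the victim's default staging
        bucket already existed in that region before the upload.
    """
    relied_on_default = not staging_bucket
    return relied_on_default and not default_bucket_already_created


# New project in a new region, no explicit bucket: the exposed case.
print(upload_was_squattable(None, default_bucket_already_created=False))
# Explicit bucket set by the caller: not exposed to this technique.
print(upload_was_squattable("gs://my-owned-bucket", default_bucket_already_created=False))
```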

Patch timeline and immediate mitigation steps

Unit 42 reported the flaw to Google's Vulnerability Reward Program on March 5, 2026. Google shipped an initial mitigation in v1.144.0 on March 31, 2026, adding a random uuid4 to the generated bucket name. A complete fix arrived in v1.148.0 on April 15, 2026, when the SDK began verifying bucket ownership in Model.upload() to block squatting. Neither Unit 42's report nor Google's Vertex AI security bulletins list a CVE for this issue. Unit 42 and Google recommend two steps: update to google-cloud-aiplatform v1.148.0 or later so the ownership check is active, and explicitly set staging_bucket to a Cloud Storage location you control when uploading models. Because the vulnerable logic was in the client SDK, users must check the google-cloud-aiplatform version everywhere the SDK runs, including notebooks, CI jobs, and training pipelines, not only production services.
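Both recommendations can be checked in code wherever the SDK runs. The following is a minimal sketch: the helper names, the project ID, and the bucket name in the usage comment are placeholders, and the version parser assumes a plain `X.Y.Z` prefix. The `staging_bucket` argument to `aiplatform.init()` is the SDK's existing parameter for setting the staging location.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

FIXED_VERSION = (1, 148, 0)  # first release with the ownership check, per the article


def parse_release(version: str) -> Tuple[int, int, int]:
    """Extract the leading 'X.Y.Z' from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def has_ownership_check(version: str) -> bool:
    """True if this SDK version is at or above the fixed release."""
    return parse_release(version) >= FIXED_VERSION


def installed_sdk_version() -> Optional[str]:
    """Version of google-cloud-aiplatform in this environment, or None if absent."""
    try:
        return metadata.version("google-cloud-aiplatform")
    except metadata.PackageNotFoundError:
        return None


def init_with_explicit_bucket(project: str, location: str, staging_bucket: str) -> None:
    """Initialize the SDK with a staging bucket the caller controls."""
    if not staging_bucket.startswith("gs://"):
        raise ValueError("staging_bucket should be a gs:// URI you own")
    from google.cloud import aiplatform  # imported lazily; requires the SDK

    aiplatform.init(project=project, location=location, staging_bucket=staging_bucket)


version = installed_sdk_version()
if version is None:
    print("google-cloud-aiplatform is not installed here")
else:
    print(version, "has ownership check:", has_ownership_check(version))

# Example call (placeholder values):
# init_with_explicit_bucket("my-project", "us-central1", "gs://my-owned-staging-bucket")
```

Running the version check in each notebook kernel, CI image, and pipeline container gives a per-environment answer, which matters because each one may pin a different SDK release.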

What this means for technologists, procurement leaders, and adversaries

  • Technologists and security teams: Verify the google-cloud-aiplatform version in every runtime (local notebooks, CI, training pipelines) and set an explicit staging_bucket to a bucket you own. Confirm that upgrades to v1.148.0 or later are applied where the SDK executes.
  • Enterprises and procurement leaders: Treat SDK client-side defaults as an operational security risk; require explicit configuration of cloud storage targets in supplier or third-party code and ensure inventories include developer environments where client libraries run.
  • Adversaries and defenders: The proof-of-concept shows that short timing windows and predictable resource naming can enable cross-tenant access even without credential theft. The same class of bug produced CVE-2026-2473 earlier in the year for a separate Vertex AI Experiments issue — a reminder that predictable naming and insufficient ownership checks are recurring attack surfaces.

Google has patched the vulnerability; users should update to v1.148.0 or later and explicitly name staging buckets they control. Unit 42 saw no evidence of exploitation in the wild, but the technical specifics (predictable bucket naming, client-side defaults, and the use of pickle/joblib in Python model artifacts) mean the problem was avoidable with configuration discipline and is now blocked at the SDK level. For teams that build or operate Vertex AI workloads, the immediate question is operational: where does the google-cloud-aiplatform client run in your environment, and has each instance been upgraded?
