Security: RCE based on Insecure Deserialization via pickle in Checkpointer and Statistics Loading

## Description
A critical security vulnerability has been identified in Dopamine's data loading and checkpointing modules. The framework uses the `pickle` module to deserialize training statistics and agent states from files opened through `tf.io.gfile`.

The vulnerability arises because `tf.io.gfile` is an abstraction layer that transparently accepts remote locations (e.g., `gs://`, `s3://`, and UNC paths), so a path argument is not guaranteed to refer to a local, trusted file. An attacker can exploit this by:
1. Providing a malicious remote path to the `load_statistics` function.
2. Hijacking the `Checkpointer.base_directory` via Gin-Config bindings (`--gin_bindings`) to point to a remote malicious repository.

In both cases, the application will download and deserialize an untrusted `pickle` payload, resulting in **Remote Code Execution (RCE)** on the training node or research workstation.
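To illustrate why deserializing an untrusted `pickle` file amounts to running attacker-chosen code, here is a minimal, deliberately benign sketch using only the standard library. The `Payload` class is hypothetical and is not taken from Dopamine or from the private proof-of-concept; it shows only that the pickle stream, not the loading application, decides which callable runs during `pickle.loads`.

```python
import pickle


class Payload:
    """Benign stand-in for a crafted object (hypothetical, for illustration)."""

    def __reduce__(self):
        # __reduce__ returns (callable, args). The unpickler calls
        # callable(*args) while loading. Here the callable is the harmless
        # built-in sorted(); a hostile file could name any importable callable.
        return (sorted, ([3, 1, 2],))


data = pickle.dumps(Payload())

# Loading does not rebuild a Payload instance. It invokes sorted([3, 1, 2])
# and returns whatever that call produced.
result = pickle.loads(data)
print(type(result).__name__, result)
```

The loader ends up with a plain list rather than a `Payload`, which shows that the call happened inside `pickle.loads` itself. No method on the loaded object ever needs to be invoked by the application.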

## Environment Information
- **Dopamine Version**: 2.0 (Pre-Keras release)
- **Frameworks**: TensorFlow / JAX
- **OS**: Platform Independent (Reproduced on Windows and Linux)

## Reproduction Summary
The vulnerability was verified by hosting a malicious `pickle` payload on a remote share (SMB/Samba) and a local virtual drive. By injecting the remote path into either the `Checkpointer` or the Colab utility functions, the framework successfully executed the payload during the state restoration process.

***(Detailed proof-of-concept and technical reproduction vectors have been submitted to the Google VRP for private review).***

## Impact
- **Severity**: High / Critical
- **Technical Impact**: Remote Code Execution (RCE).
- **Business Impact**: Unauthorized hijacking of high-cost compute resources (GPUs/TPUs), theft of proprietary model weights/research data, and potential lateral movement within training clusters or cloud VPCs.

## Proposed Remediation
We propose a transition away from `pickle` for all persistence tasks.
1. Use **JSON** or **CSV** for training statistics and logs, or replace `pickle` with the **`msgpack`** package.
2. Adopt **Protocol Buffers (Protobuf)** or **Safetensors** for agent state and checkpoint restoration.
3. Implement URI scheme validation in `Checkpointer` to ensure that remote paths are restricted to trusted laboratory buckets.
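Items 1 and 3 above could be sketched roughly as follows. This is an illustration under stated assumptions, not the refactor mentioned in this report and not Dopamine's API: `TRUSTED_PREFIXES`, `is_trusted_path`, `save_statistics_json`, and `load_statistics_json` are hypothetical names, the bucket name is a placeholder, and the built-in `open` stands in for `tf.io.gfile` so that the example stays self-contained.

```python
import json
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would configure this explicitly.
TRUSTED_PREFIXES = ("gs://my-lab-bucket/",)


def is_trusted_path(path, trusted_prefixes=TRUSTED_PREFIXES):
    """Accept plain local paths and URIs under an allowlisted prefix only."""
    # Reject UNC / network-share style paths such as \\host\share\file.
    if path.startswith("\\\\") or path.startswith("//"):
        return False
    scheme = urlparse(path).scheme
    # An empty scheme is a local path; a one-letter "scheme" is treated
    # here as a Windows drive letter (e.g. C:\runs\stats.json).
    if len(scheme) <= 1:
        return True
    return any(path.startswith(prefix) for prefix in trusted_prefixes)


def save_statistics_json(path, statistics):
    """Write statistics as JSON, a data-only format with no code execution."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(statistics, f)


def load_statistics_json(path):
    """Load statistics after checking the path against the allowlist."""
    if not is_trusted_path(path):
        raise ValueError(f"Refusing to load from untrusted path: {path!r}")
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

A prefix check like this is only a first line of defense. Switching to a data-only format such as JSON is what removes the code-execution primitive, because `json.load` can only produce dictionaries, lists, strings, numbers, booleans, and `None`.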

**Because the project states that it does not openly accept PRs, I am writing to let you know that I have a code refactor ready to submit as a Pull Request. I will share it as soon as you confirm.**

<img width="927" height="499" alt="Image" src="https://github.com/user-attachments/assets/0618419d-df1b-4936-9ed1-2ebf6740fdc4" />

Best regards,
Joshua Provoste

