CosyVoice gRPC Server Insecure Deserialization Flaw CVE-2026-31251
CVE-2026-31251: CosyVoice gRPC server deserializes untrusted models via torch.load() without weights_only=True, enabling RCE via crafted .pt files. No patch confirmed.

Indicators of Compromise
| Type | Value | Description | Confidence |
|---|---|---|---|
| SHA1 | 6e01309e01bc93bbeb83bdd996b1182a81aaf11e | Git commit hash of the affected CosyVoice revision, extracted from source material (not a file hash) | high |
Executive Summary
A critical insecure deserialization vulnerability, tracked as CVE-2026-31251, has been disclosed in the CosyVoice text-to-speech (TTS) framework. The flaw resides in the gRPC server component, which loads speech synthesis models with PyTorch's torch.load() function without setting weights_only=True. An attacker who can supply a crafted model file (e.g., a .pt file) to the server can achieve remote code execution (RCE) with the privileges of the CosyVoice process. The vulnerability carries a CVSS v3.1 base score of 9.8, reflecting network exploitability, low attack complexity, and no required authentication. According to the National Vulnerability Database (NVD) entry, the flaw affects CosyVoice through commit 6e01309e01bc93bbeb83bdd996b1182a81aaf11e, which the entry lists with the malformed date "2025-30-21". At the time of publication, the project maintainers have released no official patch and have documented no formal workaround.
Technical Analysis
CosyVoice is an open-source, multi-lingual speech synthesis framework developed by FunAudioLLM (a project under the Alibaba Group). It provides a gRPC-based server that accepts requests to generate speech from text. The server loads pre-trained PyTorch model files (.pt or .pth) from a user-specified directory at startup.
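The vulnerability is rooted in the server's use of torch.load() to deserialize these model files. torch.load() is built on Python's pickle format, and when weights_only is False (the default in PyTorch releases before 2.6) it will deserialize arbitrary Python objects, including objects that execute code during unpickling. PyTorch introduced the weights_only parameter in version 1.13 (released October 2022) specifically to mitigate this class of attack. When it is set to True, the deserializer restricts unpickled objects to an allowlist of tensors and serialization primitives, preventing the execution of arbitrary Python code. PyTorch 2.6 changed the default to weights_only=True, so a call that omits the argument is exposed only on older PyTorch versions, while a call that passes weights_only=False explicitly is exposed on any version.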
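To make the mechanism concrete, the following is a minimal sketch that uses only Python's standard pickle module, not PyTorch itself. The Payload and AllowlistUnpickler names are hypothetical, and os.getpid stands in as a harmless substitute for an attacker-chosen callable. The allowlist class illustrates the general idea behind a restricted unpickler; it is not PyTorch's actual weights_only implementation.

```python
import io
import os
import pickle


class Payload:
    """Hypothetical object whose pickled form asks the unpickler to call a function."""

    def __reduce__(self):
        # A real attack would name something like os.system here;
        # os.getpid is a harmless stand-in with no side effects.
        return (os.getpid, ())


class AllowlistUnpickler(pickle.Unpickler):
    """Illustrative restricted unpickler: only allowlisted globals may be resolved."""

    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


blob = pickle.dumps(Payload())

# Unrestricted load: the callable runs during deserialization, and its
# return value (not a Payload instance) is what loads() hands back.
returned = pickle.loads(blob)

# Restricted load: the reference to the callable is rejected before it is invoked.
try:
    AllowlistUnpickler(io.BytesIO(blob)).load()
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The key point is that the unrestricted load never needs the Payload class at all: the byte stream itself names the function to call, which is why a model file from an untrusted source is equivalent to untrusted code.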
According to the NVD description, the CosyVoice gRPC server fails to set weights_only=True when calling torch.load(). This omission means that any attacker who can deliver a maliciously crafted .pt file to the server can achieve arbitrary code execution. The attack vector is classified under CWE-502: Deserialization of Untrusted Data.
The exploitation scenario is straightforward: an attacker with network access to the gRPC server endpoint can send a request that causes the server to load a model file from a path under the attacker's control, or can replace a legitimate model file on disk if write access is obtained. Once the malicious file is deserialized, the embedded payload executes within the server's Python process. The NVD assessment notes that the attack complexity is low, no privileges are required, and no user interaction is needed beyond the initial server startup or model-loading operation.
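One way to narrow the "path under the attacker's control" case is to confine any requested model path to a trusted directory before it reaches the loader. The sketch below assumes a hypothetical helper, resolve_model_path, and a hypothetical file name; the source does not describe how the CosyVoice gRPC server actually receives or resolves model paths, so this is a generic pattern rather than a fix for its code.

```python
import tempfile
from pathlib import Path

ALLOWED_SUFFIXES = {".pt", ".pth"}


def resolve_model_path(model_root, requested):
    """Return the resolved path if it stays inside model_root, else raise ValueError."""
    root = Path(model_root).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check below is made against the real location.
    candidate = (root / requested).resolve()
    if root not in candidate.parents:
        raise ValueError(f"model path escapes trusted directory: {requested}")
    if candidate.suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"unexpected model file extension: {candidate.suffix}")
    return candidate


# Usage with a throwaway directory; "speech/llm.pt" is a made-up file name.
with tempfile.TemporaryDirectory() as root_dir:
    accepted = resolve_model_path(root_dir, "speech/llm.pt")
    try:
        resolve_model_path(root_dir, "../outside/evil.pt")
        traversal_rejected = False
    except ValueError:
        traversal_rejected = True
```

Path confinement does not help if an attacker can write into the trusted directory, so it complements rather than replaces integrity checks and weights_only=True.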
The vulnerability is present specifically in the gRPC server component. Users who interact with CosyVoice only through its Python API or command-line inference scripts may not be exposed, though the maintainers have not explicitly confirmed this. The affected commit hash is 6e01309e01bc93bbeb83bdd996b1182a81aaf11e, which the NVD lists with a date of "2025-30-21". That value is likely a typo for 2025-03-21 or 2025-10-21, but the commit itself is publicly accessible.
Mitigations & Recommendations
As of May 12, 2026, no official patch has been released by the CosyVoice maintainers. Defenders who operate CosyVoice gRPC servers should take the following steps:
- Restrict network access to the gRPC server endpoint. Use firewall rules or network segmentation to limit connections to trusted clients only. The server should not be exposed to the internet or untrusted networks.
- Validate model file sources. Ensure that only model files from trusted, verified sources are loaded. Implement file integrity checks (e.g., cryptographic hashes or signatures) before loading any .pt file.
- Monitor for anomalous model loading. Enable logging on the gRPC server to detect unexpected file paths or repeated load attempts. An attacker probing for the vulnerability may trigger multiple model-load requests.
- Apply a code-level workaround. If you have access to the CosyVoice source code, locate the torch.load() call in the gRPC server handler and add weights_only=True to the invocation. This change is backward-compatible for standard model files and blocks deserialization of arbitrary Python objects. Test thoroughly before deploying to production.
- Consider using an alternative TTS server that does not rely on torch.load() for untrusted input, or wrap the CosyVoice server in a container with minimal privileges and a read-only filesystem for model directories.
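The integrity-check and code-level recommendations above can be combined in a small wrapper. This is a sketch under stated assumptions: sha256_of and load_verified_checkpoint are hypothetical helper names, the expected digest would come from a source you trust, and the exact location of the torch.load() call in the CosyVoice server is not given in the source. The only PyTorch call used is torch.load() with its documented map_location and weights_only arguments.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large checkpoints are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_verified_checkpoint(path, expected_sha256):
    """Refuse to deserialize unless the file matches a known-good digest."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"hash mismatch for {path}: got {actual}")
    # Imported lazily so the hash check itself works without PyTorch installed.
    import torch
    # weights_only=True restricts unpickling to tensors and basic containers.
    return torch.load(path, map_location="cpu", weights_only=True)


# Usage: a file whose digest is not the expected one is rejected before
# torch.load() is ever reached.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "model.pt"
    sample.write_bytes(b"not a real checkpoint")
    sample_digest = sha256_of(sample)
    try:
        load_verified_checkpoint(sample, "0" * 64)
        mismatch_rejected = False
    except ValueError:
        mismatch_rejected = True
```

Checkpoints that store objects other than tensors and plain containers may fail to load with weights_only=True, which is one reason to test the change against your own model files before deploying it.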
Until a patch is available, any deployment of CosyVoice that loads models from user-supplied paths or untrusted network sources should be considered at high risk of compromise.

