Docling XXE Flaw CVE-2026-31248 Lets Attackers Trigger XML Bomb DoS
CVE-2026-31248: Docling METS GBS backend through 2.61.0 fails to disable entity resolution in etree.fromstring(), enabling XML Bomb attacks via crafted .tar.gz archives.

Executive Summary
A likely high-severity XML External Entity (XXE) vulnerability, tracked as CVE-2026-31248, affects the Docling document processing library's METS GBS backend through version 2.61.0. The flaw allows an unauthenticated attacker to trigger a denial-of-service (DoS) condition by submitting a specially crafted .tar.gz archive containing a malicious XML file with nested entity definitions, a technique known as an XML Bomb or billion laughs attack. The vulnerability stems from the backend's use of Python's etree.fromstring() without disabling entity resolution, which enables exponential entity expansion that can exhaust system memory and CPU resources. No CVSS score has been assigned yet. Docling maintainers have acknowledged the issue; a patched version is expected in an upcoming release.
Technical Analysis
Docling is an open-source document conversion library developed by the Docling Project, designed to parse and transform various document formats including PDF, DOCX, and XML-based METS (Metadata Encoding and Transmission Standard) files. The METS GBS (Google Books Style) backend specifically processes XML metadata packaged inside .tar.gz archives.
According to the vulnerability disclosure published on the project's GitHub repository, the METS GBS backend extracts XML files from .tar.gz archives and validates them using Python's xml.etree.ElementTree.fromstring() method. By default, this parser resolves XML entities — a feature that, when left enabled, permits entity expansion attacks.
The XML Bomb technique, also known as the billion laughs attack, exploits nested entity definitions. An attacker constructs an XML document in which each entity references the previous one several times, so the parser expands a small input (e.g., a few kilobytes) into gigabytes of in-memory data. For example, if &lol2; is defined as ten copies of &lol1;, and &lol1; as ten copies of &lol;, every added level multiplies the output by ten. The growth is exponential in the nesting depth and can quickly exhaust available memory and CPU, crashing the application or rendering the host unresponsive.
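The arithmetic behind that expansion is easy to check without parsing anything. The sketch below builds the text of a nested-entity document and computes how large it would become if fully expanded. The function names, the lol0..lolN entity naming, and the choice of nine levels with ten references each are illustrative assumptions, not details taken from the disclosure.

```python
def build_nested_entity_doc(levels: int, fanout: int, base: str = "lol") -> str:
    """Return XML text with `levels` layers of entities.

    Each layer references the previous layer `fanout` times.
    The text is only built here, never parsed.
    """
    decls = [f'<!ENTITY lol0 "{base}">']
    for i in range(1, levels + 1):
        refs = f"&lol{i - 1};" * fanout
        decls.append(f'<!ENTITY lol{i} "{refs}">')
    dtd = "\n  ".join(decls)
    return (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE lolz [\n  {dtd}\n]>\n"
        f"<lolz>&lol{levels};</lolz>"
    )


def expanded_length(levels: int, fanout: int, base: str = "lol") -> int:
    """Characters produced if the top-level entity were fully expanded."""
    return len(base) * fanout ** levels


doc = build_nested_entity_doc(levels=9, fanout=10)
print(len(doc.encode("utf-8")), "bytes of input")
print(expanded_length(9, 10), "characters after full expansion")
```

With these assumed parameters the input stays under a couple of kilobytes while the fully expanded text would be three billion characters, which is why a size check on the uploaded file alone does not help.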
An attacker can embed such a malicious XML file inside a .tar.gz archive and submit it to any service or application using Docling's METS GBS backend. Upon extraction and parsing, the etree.fromstring() call processes the entity definitions without restriction, triggering the exponential expansion. The vulnerability is present in all Docling versions up to and including 2.61.0.
No CVSS score has been officially assigned to CVE-2026-31248 as of this writing. However, based on the attack vector — network-delivered, low complexity, no authentication required — and the impact (availability loss via resource exhaustion), the severity is expected to be high, likely in the 7.5–8.0 range under CVSSv3.1. The attack does not lead to data exfiltration or code execution, limiting its scope to denial of service.
The issue was reported through the project's GitHub issue tracker. The disclosure includes a proof-of-concept demonstrating the XML Bomb payload and its effect on unpatched versions. The Docling maintainers have confirmed the finding and are working on a fix that disables entity resolution in the XML parser.
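A fix along the lines described, disabling entity resolution in the parser, might look roughly like the sketch below. It assumes the lxml library, because resolve_entities is an lxml.etree.XMLParser option and the standard library's xml.etree.ElementTree.XMLParser does not accept it. The function name parse_untrusted_xml and the sample document are hypothetical, and this is not the maintainers' actual patch.

```python
from lxml import etree  # third-party; assumed because resolve_entities is an lxml option


def parse_untrusted_xml(data: bytes):
    """Parse XML bytes without expanding entities or touching the network."""
    parser = etree.XMLParser(
        resolve_entities=False,  # leave &name; references unexpanded
        no_network=True,         # never fetch external resources (lxml default)
        load_dtd=False,          # do not load an external DTD (lxml default)
        huge_tree=False,         # keep libxml2's built-in size limits (lxml default)
    )
    return etree.fromstring(data, parser=parser)


# Hypothetical sample: one internal entity that would normally expand to "AAAA".
sample = b'<!DOCTYPE r [<!ENTITY a "AAAA">]><r>&a;</r>'
root = parse_untrusted_xml(sample)
print(etree.tostring(root))
```

With resolution disabled, the reference stays in the tree as an entity node instead of being replaced by its value, so nested definitions are never multiplied out.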
Mitigations & Recommendations
Until a patched version of Docling is released, organizations using the library — particularly those exposing the METS GBS backend to untrusted input — should implement the following mitigations:
- Disable entity resolution manually: If possible, modify the application code to call etree.fromstring() with parser = etree.XMLParser(resolve_entities=False) instead of the default parser. This prevents entity expansion entirely. Note that resolve_entities is an lxml.etree.XMLParser option; the standard library's xml.etree.ElementTree.XMLParser does not accept it.
- Validate archive contents: Implement pre-processing checks on .tar.gz archives to reject files containing XML with entity definitions before they reach the parser.
- Apply resource limits: Configure memory and CPU limits at the OS or container level (e.g., cgroups, ulimit) to contain the blast radius of any successful XML Bomb attack.
- Monitor for anomalous resource usage: Deploy monitoring rules that alert on sudden spikes in memory or CPU consumption in services that process document uploads.
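The archive-validation step in the list above can be sketched with the standard library alone. The example walks a .tar.gz held in memory and flags any .xml member that declares an entity, using an expat handler that aborts at the first declaration, before any expansion can happen. The function names, the .xml suffix filter, and the 10 MiB member cap are assumptions made for illustration; a real deployment would adapt them to whatever layout its METS GBS archives actually use.

```python
import io
import tarfile
import xml.parsers.expat
from typing import List


class EntityDeclarationFound(Exception):
    """Raised as soon as the XML declares any entity."""


def xml_is_acceptable(xml_bytes: bytes) -> bool:
    """Return False if the XML declares entities or is not well-formed."""
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, *rest):
        # Abort immediately; nothing has been expanded at this point.
        raise EntityDeclarationFound(name)

    parser.EntityDeclHandler = on_entity_decl
    try:
        parser.Parse(xml_bytes, True)
    except (EntityDeclarationFound, xml.parsers.expat.ExpatError):
        return False
    return True


def find_suspicious_members(
    archive_bytes: bytes, max_member_bytes: int = 10 * 1024 * 1024
) -> List[str]:
    """List .xml members that are oversized, malformed, or declare entities."""
    flagged = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar:
            if not member.isfile() or not member.name.lower().endswith(".xml"):
                continue
            if member.size > max_member_bytes:
                flagged.append(member.name)
                continue
            data = tar.extractfile(member).read()
            if not xml_is_acceptable(data):
                flagged.append(member.name)
    return flagged
```

An upload handler could call find_suspicious_members() on the raw bytes and reject the archive if the returned list is non-empty. Using a real parser for the check, rather than searching for the string "<!ENTITY", avoids being bypassed by alternate encodings such as UTF-16.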
Defenders should watch the Docling GitHub repository for the patched release and apply it promptly once available.

