
Why this matters for AI infrastructure
The vulnerable inference servers form the backbone of many enterprise-grade AI stacks, processing sensitive prompts, model weights, and customer data. Oligo reported identifying thousands of exposed ZeroMQ sockets on the public internet, some tied to these inference clusters.
If exploited, the flaws could let an attacker execute arbitrary code on GPU clusters, escalate privileges, exfiltrate model weights or customer data, or install cryptocurrency miners on the GPUs, turning an AI infrastructure asset into a liability.
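The article does not spell out the exploit mechanism, but the common root cause in this class of bug is an unauthenticated network socket feeding attacker-controlled bytes into an unsafe deserializer such as Python's pickle (PyZMQ's recv_pyobj convenience method, for example, is a thin wrapper around pickle.loads). The sketch below, with purely illustrative names, shows why deserializing untrusted pickle data is equivalent to remote code execution: the payload's __reduce__ hook runs an arbitrary callable the moment the bytes are loaded.

```python
import os
import pickle


class Malicious:
    """Illustrative attacker payload, not taken from the Oligo report."""

    def __reduce__(self):
        # __reduce__ tells pickle how to "rebuild" this object.
        # An attacker returns any callable plus arguments; pickle
        # invokes it during deserialization, before any app logic runs.
        return (exec, ("import os; os.environ['PWNED'] = 'yes'",))


# What an attacker would send over the exposed socket:
payload = pickle.dumps(Malicious())

# What a vulnerable server effectively does on receipt
# (e.g. via a pickle-based recv helper):
pickle.loads(payload)

# The attacker's code has already run in the server process.
print(os.environ.get("PWNED"))
```

The mitigation is equally general: never unpickle data from an untrusted peer; use a data-only format such as JSON or a schema-checked serializer, and authenticate the transport (ZeroMQ supports CURVE-based encryption and authentication) so the socket is not reachable by arbitrary internet hosts in the first place.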
SGLang has been adopted by several large enterprises, including xAI, AMD, Nvidia, Intel, LinkedIn, Cursor, Oracle Cloud, and Google Cloud, Lumelsky noted.
