Attackers Seize Exposed AI Endpoints to Power Offensive Ops

Between March and May, Zenity researchers observed three distinct campaigns that hijacked honeypot large language model (LLM) infrastructure, reached through exposed Ollama and LiteLLM endpoints, as a resource for offensive AI operations. The attackers abused "the inference endpoints that self-hosted AI software exposes for applications to call," which require no special authentication, only knowledge of the endpoint. Examples include Ollama's "/api/generate" and "/api/chat" endpoints on port 11434 and LiteLLM's "/v1/responses" endpoint on port 4000. The three operators put the hijacked inference to different uses: two ran autonomous penetration testing frameworks (Strix and HexStrike AI), and one ran "an OpenAI Codex agent carrying a persona built to suppress safety refusals and assisting in web reverse-engineering work." The Strix operator, working from a single source IP, used a LiteLLM client to send a 140,000-character prompt that weaponized Strix against an unidentified French auction house. The HexStrike AI operator pointed a desktop LLM client at the honeypot's Ollama instance and sent the penetration testing orchestration server's set of more than 150 offensive tools, with no target identified. The third operator, from another source IP, pointed the OpenAI Codex agent at a honeypot's LiteLLM proxy, with the agent posing as a security auditor.
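For defenders, the exposure described above comes down to two conditions: the inference service listens on a non-loopback address, and nothing in front of it asks for credentials. The following is a minimal Python sketch of that check, not a tool from the Zenity research. The `audit` and `is_loopback_only` helpers and the tuple layout are hypothetical, and the listener inventory is assumed to come from your own configuration or asset data.

```python
import ipaddress


def is_loopback_only(bind_host: str) -> bool:
    """Return True if bind_host is reachable only from the local machine."""
    if bind_host == "localhost":
        return True
    try:
        return ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name): assume it may be reachable.
        return False


def audit(listeners):
    """Flag listeners that are network-reachable and unauthenticated.

    listeners: iterable of (service, bind_host, port, requires_auth) tuples.
    This layout is an assumption made for illustration only.
    """
    findings = []
    for service, host, port, requires_auth in listeners:
        if not is_loopback_only(host) and not requires_auth:
            findings.append(
                f"{service} on {host}:{port} is reachable without authentication"
            )
    return findings


if __name__ == "__main__":
    # Ports 11434 and 4000 are the ones named in the report.
    inventory = [
        ("ollama", "0.0.0.0", 11434, False),
        ("litellm", "127.0.0.1", 4000, False),
    ]
    for finding in audit(inventory):
        print(finding)
```

A listener bound to a loopback address, or one that sits behind an authenticating proxy, is not flagged. The sketch only inspects declared configuration; it does not probe the network, so firewall rules and port forwarding need to be checked separately.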