UBS Finds Enterprises Throttling AI Spending

In a research note dated June 23, 2026, UBS analysts Karl Keirstead, Timothy Arcuri, and Taylor McGinnis reported that, based on conversations with roughly a dozen enterprise IT executives, about 60% of enterprises were "in some manner throttling AI spend" by adding guardrails such as token pooling, model downgrading, waste alerts, and per-user usage limits. Reported examples include one company that exhausted a large share of its annual token budget and cut its internal AI tools from five to two, one user who ran up $35,000 in a single month on AWS Bedrock, and DevOps teams that consistently exceeded weekly token quotas by 100-200%.

The analysts called the pattern a modest "emerging headwind" for AI model makers and said "token spend optimization has become a key issue in most organizations." The primary technical response is model routing, which directs routine tasks to cheaper models. The note names open-source and Chinese models, including DeepSeek, Alibaba's Qwen, MiniMax, and Zhipu AI's GLM, as beneficiaries, with one large global bank reportedly deploying Qwen on-premises.

A companion UBS survey of roughly 130 companies found that only 8% have deployed AI agents at scale in production.
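To make the guardrails named in the note more concrete, the following is a minimal Python sketch of how model routing, model downgrading, and a per-user usage limit might fit together. It is illustrative only and does not come from the UBS note: the tier names (`small-model`, `large-model`), the set of "routine" task kinds, the half-of-remaining-budget downgrade rule, and the `TokenBudget` and `choose_model` helpers are all hypothetical assumptions.

```python
from dataclasses import dataclass

# Hypothetical task kinds treated as "routine"; a real deployment would
# define its own categories.
ROUTINE_TASKS = {"summarize", "classify", "extract"}


@dataclass
class TokenBudget:
    """Per-user token allowance for one budget period (assumed design)."""
    limit: int
    used: int = 0

    def remaining(self) -> int:
        return max(self.limit - self.used, 0)

    def charge(self, tokens: int) -> None:
        self.used += tokens


def choose_model(task_kind: str, est_tokens: int, budget: TokenBudget) -> str:
    """Return a hypothetical model tier for a request.

    Raises RuntimeError when the request would exceed the user's limit.
    """
    remaining = budget.remaining()
    if est_tokens > remaining:
        # Per-user usage limit: refuse rather than overspend.
        raise RuntimeError("per-user token limit reached")
    if task_kind in ROUTINE_TASKS:
        # Model routing: routine work goes to the cheaper tier.
        return "small-model"
    if est_tokens > remaining // 2:
        # Model downgrading: a request that would use more than half of
        # what is left falls back to the cheaper tier (assumed rule).
        return "small-model"
    return "large-model"
```

For example, with `TokenBudget(limit=10_000)`, a 500-token `"summarize"` request is routed to the cheaper tier, while a 500-token `"code_review"` request goes to the larger one. Once 9,000 tokens have been charged, an 800-token `"code_review"` request is downgraded, and a 2,000-token request is refused.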