Tech Giants Reel as AI Token Costs Soar, Prompting Budget Cuts and a Return to Human Labor

Deep News 06-01 17:51

In the AI era, what is the most expensive resource? Following the earlier frenzy for graphics cards and talent, it now appears to be computing power and the cost of AI tokens. Not only are ordinary companies forced to be frugal, but even tech giants are feeling the pinch. The concept of "Tokenmaxxing" has rapidly become a joke.

Aaron Levie, CEO of cloud service company Box, recently attended a dinner with numerous top corporate executives and found that the most discussed topic among business leaders was not macroeconomic issues, but their companies' AI token costs.

Levie's observation is not surprising. Uber's CTO, Praveen Neppalli Naga, admitted last month that after deploying Claude Code to about 5,000 engineers, the company burned through its entire annual AI budget in just four months, catching him off guard.

Uber management's intention was to promote the widespread adoption of AI tools among engineers, and from that perspective, they succeeded. Last month, Claude Code usage among Uber engineers soared to 95%. However, management failed to consider the other side: the API cost per engineer could range from $500 to $2,000 monthly.
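To see how quickly that per-engineer range adds up, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that all of the roughly 5,000 engineers fall somewhere in the $500 to $2,000 monthly range. The function name and the uniform-cost assumption are hypothetical and not from Uber.

```python
def monthly_spend_range(engineers: int, low_per_seat: float, high_per_seat: float) -> tuple[float, float]:
    """Return the (low, high) total monthly API spend in dollars."""
    return engineers * low_per_seat, engineers * high_per_seat


# Figures quoted in the article: about 5,000 engineers, $500 to $2,000 each per month.
low, high = monthly_spend_range(5_000, 500, 2_000)

print(f"Monthly spend:    ${low:,.0f} to ${high:,.0f}")
# The article says the annual budget was gone in four months.
print(f"Four-month spend: ${4 * low:,.0f} to ${4 * high:,.0f}")
```

Under these assumptions the bill lands between $2.5 million and $10 million a month, or $10 million to $40 million over the four months it took to exhaust the annual budget.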

Faced with staggering token bills, even a $150 billion company like Uber had to urgently implement strict tiered management, limiting employee usage and budgeting each token as carefully as it once saved paper.

**Even Tech Giants Can't Afford the Burn**

Surprisingly, it's not just mid-sized companies like Uber tightening their AI budgets; even hyperscale cloud giants like Microsoft and Amazon are doing the same.

Late last year, Microsoft's core Experiences and Devices division—the engineering team responsible for Windows, Microsoft 365, Outlook, Teams, and Surface—pioneered the large-scale internal rollout of Claude Code.

While Microsoft publicly emphasized that AI usage was not part of employee performance reviews, an internal memo from development head Julia Liuson circulated, stating AI use was "no longer optional, but a core requirement for every role and level."

Yet, just six months later, this experiment was abruptly halted. The reason wasn't poor results, but that employees used it too enthusiastically and heavily, catching the IT department unprepared. This week, Microsoft began revoking most internal Claude Code licenses. Thousands of engineers were forced to migrate back to Microsoft's own GitHub Copilot CLI for cost control.

An even more startling case emerged from an Axios exclusive this week. A tech giant, having failed to set usage caps on employee Claude licenses, burned through $500 million in a single month. This isn't an abstract figure: $500 million equals the annual revenue of many mid-sized tech firms.

While the report didn't name the company, it is widely believed to be Amazon, which had previously aggressively promoted internal AI tools and incorporated developer AI usage into internal ranking metrics, requiring over 80% of developers to use AI weekly.

This evaluation system quickly spawned a new term: Tokenmaxxing, meaning employees artificially generating meaningless AI consumption to inflate their ranking data. Amazon employees automated mundane tasks that didn't require AI, letting AI agents idle in the background just to boost their token consumption numbers.

Confronted with a monthly token bill reaching $500 million, Amazon executives finally realized employees were using AI heavily to climb the rankings, not to solve actual business problems.

Just last week, Amazon abandoned its AI usage metrics, instructing employees to use AI tools efficiently. One Amazon executive even warned staff not to use AI just for the sake of using it, marking a stark contrast to their previous stance.

A similar token-gaming farce unfolded at Meta Platforms, Inc. The company's Chief People Officer, Janelle Gale, explicitly announced that AI usage and business impact would be formally integrated into performance reviews starting in 2026, with top employees' bonus caps exceeding 200% of their annual salary.

Meta even mandated that employees install software on their work computers to track AI usage in detail while providing training data for AI automation. Meta's CTO, Andrew Bosworth, publicly stated the goal was to develop AI agents capable of autonomously performing work tasks.

Bosworth noted that his top engineers spent on AI tokens an amount equivalent to his own salary, citing this as evidence of multiplied productivity. Under this directive, token consumption quietly transformed from an engineering metric into a currency for career advancement.

An unofficial leaderboard called "Claudeonomics" even appeared internally at Meta, tracking the token consumption of roughly 85,000 users company-wide. Over a 30-day tracking window, the board showed total usage exceeding 60 trillion tokens. It was urgently taken down two days after exposure due to "internal data being accessed externally."
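For a sense of scale, the leaderboard totals can be turned into a per-user average. This is a rough sketch that treats "exceeding 60 trillion" as exactly 60 trillion and assumes usage was spread evenly across all users, which a leaderboard by its nature suggests it was not.

```python
TOTAL_TOKENS = 60e12  # "exceeding 60 trillion" in the article; used here as a floor
USERS = 85_000        # "roughly 85,000 users"
WINDOW_DAYS = 30

per_user = TOTAL_TOKENS / USERS
per_user_per_day = per_user / WINDOW_DAYS

print(f"{per_user / 1e6:,.0f} million tokens per user over the window")
print(f"{per_user_per_day / 1e6:,.1f} million tokens per user per day")
```

That works out to roughly 706 million tokens per user over the window, or about 23.5 million per user per day on average.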

**The Bill Brings a Sudden Reality Check**

The swing to companies being unable to afford their token burn happened remarkably fast. Not long ago, tech firms were encouraging employees to fully transition to AI, offering unlimited token allowances.

During NVIDIA's GTC conference in March, CEO Jensen Huang publicly stated plans to provide every NVIDIA employee with a $500,000 annual token allowance. Huang even emphasized he would be "disappointed" if any employee didn't use up that $500,000 by year-end, as it would mean they weren't fully leveraging AI to reshape their work.

As tech giants raced to tell Wall Street their AI transformation stories, token consumption was used as a proxy for AI adoption depth. More tokens consumed supposedly proved deeper integration into workflows. Thus, consumption itself became the goal, with productivity a mere byproduct.

But the reality is that tying AI usage to performance reviews only leads to meaningless token waste. According to estimates by US software intelligence platform Jellyfish, the cost of merging a pull request is just $0.28 with light AI use, but under heavy use, it can jump to $89.32. More tokens are consumed, but no more product is delivered, only a larger bill.
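The gap between those two Jellyfish figures is easier to grasp as a multiple. A minimal sketch, using only the two per-pull-request costs quoted above (the variable names are illustrative):

```python
LIGHT_USE_COST = 0.28   # dollars per merged pull request, light AI use
HEAVY_USE_COST = 89.32  # dollars per merged pull request, heavy AI use

multiple = HEAVY_USE_COST / LIGHT_USE_COST
print(f"Heavy use costs about {multiple:.0f}x as much per merged pull request")
```

In other words, the same unit of delivered work costs roughly 319 times more at the heavy end of the estimate.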

Tech media Axios accurately summarized the current corporate AI dilemma with one term: sticker shock. Companies once eager to embrace AI are collectively facing inflated IT costs, hard-to-quantify productivity gains, and growing employee skepticism.

Goldman Sachs' latest forecast suggests global token consumption could grow 24-fold by 2030, reaching 120 quadrillion per month, as businesses and consumers massively adopt AI agents. This is a dizzying number—but the number itself creates no value. Research firm Mavvrik's survey shows 85% of corporate AI cost forecasts have errors exceeding 10%; 84% of companies say AI spending has compressed their gross margins by over 6 percentage points.

Gartner's forecast is thoroughly sobering: even if AI inference costs fall by nearly 90% by 2030, the total corporate AI bill won't get cheaper. The reason is simple: agent workflows consume hundreds to thousands of times more tokens than an ordinary single conversation, and that growth in consumption is enough to swamp any savings from price drops.
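The arithmetic behind that warning can be sketched in a few lines. This assumes a simple model in which the total bill is unit price times tokens consumed, and that a workload moves from single conversations to agent workflows. The function name is hypothetical, and the 100x and 1,000x cases stand in for "hundreds to thousands of times."

```python
def bill_multiplier(price_drop: float, usage_growth: float) -> float:
    """Factor by which the total bill changes: (1 - price_drop) * usage_growth."""
    return (1.0 - price_drop) * usage_growth


# A 90% price drop against 100x and 1,000x growth in tokens consumed.
for growth in (100, 1_000):
    print(f"usage x{growth:,}: bill x{bill_multiplier(0.90, growth):.0f}")
```

Under this model, a 90% price cut only breaks even if consumption grows no more than tenfold. At 100x to 1,000x growth the bill is still 10 to 100 times larger.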

"Chief product officers shouldn't mistake commodity token deflation for the democratization of cutting-edge inference capabilities," warned Gartner senior analyst Will Sommer.

Companies that once operated on a "turn on AI and see what happens" strategy are now playing catch-up, implementing the governance and cost-control mechanisms that should have been established before rollout.

Token bills are forcing companies to rethink a most basic question: What problem is AI actually solving for me? This question should have been answered before purchasing licenses. Now, the market is answering it for them with the bill.

Uber COO Andrew Macdonald was blunt: "AI costs are becoming increasingly difficult to justify." This is a sobering echo of his CTO's earlier statement. Uber's CEO has even publicly stated that he sees no clear correlation between extreme token consumption and the actual delivery of valuable products.

**A Return to Hiring Human Staff**

While many companies are implementing layoffs citing "AI can replace humans," the sharp rise in token costs and the inefficiency of AI returns have prompted some to hit the brakes.

Swedish fintech giant Klarna was once the most aggressive proponent of AI replacing human labor. In 2024, CEO Sebastian Siemiatkowski claimed AI "could already perform all human jobs," using this to freeze hiring and cut about 1,200 employees, reducing headcount from 5,500 in 2022 to 3,400.

However, by early 2026, Klarna began quietly reversing course: restarting recruitment for human customer service agents and admitting a mistake—"cost unfortunately became too dominant an evaluation factor, resulting in a decline in service quality." AI is fast, but it's not good at handling angry customers, complex disputes, or scenarios requiring empathy. Klarna's costly U-turn serves as a warning to the entire industry.

The case of Commonwealth Bank of Australia (CBA) is equally telling. In July 2025, Australia's largest bank announced it was replacing 45 customer service employees with AI voice bots, reasoning that AI could handle a large volume of simple calls automatically. Reality quickly proved otherwise: after the bots launched, call volumes increased instead of falling, customer complaints surged, remaining staff were forced into extensive overtime, and management even personally took customer service hotline calls. Just one month later, CBA publicly apologized to the laid-off workers, paid compensation, withdrew the layoff decision, and invited all 45 back. The Australian Finance Sector Union called this "a massive victory."

These two incidents reflect a structural trap in the AI era that hasn't been fully confronted: when planning AI replacement schemes, companies often only calculate "how much labor cost is saved," but fail to account for the remediation costs when AI fails: customer churn, brand damage, re-recruiting and training, and compensation payments to employees.

A joint survey by research firms Orgvue and Forrester shows that among companies that rushed to replace human labor with AI, 55% expressed regret afterward. Tokens can be cheap, but the hidden cost of replacing people with tokens is often far higher than the figure on the bill.

This reversal has come faster than anyone expected. From Jensen Huang announcing on the GTC stage, "I'd be disappointed if employees don't burn through $500K in tokens," to Amazon executives cautioning employees, "Don't use AI just for the sake of using it," only a few months have passed. Silicon Valley's attitude towards tokens has completed a 180-degree turn.

This doesn't mean the AI era is over; it means it's truly beginning. The previous phase of "indiscriminate rollout" was essentially an organizational experiment paid for with wasted computing power: companies footed the bill with real money, employees went along with their time, and customers underwrote it with their trust. Now the bill is on the table, and it's finally time to do the math: which scenarios truly need AI, and which are just performances for the leaderboard.

That Gartner statement is worth posting on every decision-maker's office wall: Don't mistake token price drops for the democratization of AI capability. The real question was never whether tokens are expensive, but for whom each consumed token is actually creating value.
