Summary of Key Points
Recently, many companies have made the same mistake in their AI transformations: they have treated the number of tokens consumed as a key performance indicator (KPI) for employees. As a result, they lost control of their token costs and wasted large sums of money (for example, one company spent $500 million in one month; Meta, Uber, and miHoYo have all had similar incidents). Meanwhile, the upstream companies that provide AI models (such as Anthropic) and the chip makers (such as NVIDIA) have reaped huge profits from this spending. Amazon was the first to recognize the problem: it removed its token leaderboard and replaced it with indicators that focus on actual output. The lesson is that improving AI efficiency is not about consuming more tokens, but about integrating AI into business processes to solve real problems.
1. Out-of-Control Token Costs: The Fault of KPIs
Let’s start with what tokens are: think of them as the “fuel” for AI. Every time a model processes text or writes code it consumes tokens, and the more it consumes, the higher the cost.
The main reason spending spirals out of control is that token consumption is being used as a KPI:
- One company’s manager gave everyone access to Claude without setting a usage limit; employees kept retrying failed tasks, running up $500 million in costs in one month;
- Meta created a “Claude Economics” leaderboard, whose top user consumed 28.1 billion tokens in a month at a cost of nearly $500,000;
- Employees at miHoYo ran dozens of AI assistants on one project, spending 2 million RMB in a single night;
- Uber rolled out AI tools to 5,000 engineers and exhausted its entire 2026 budget ahead of schedule.
Worse still, the AI assistant (agent) workflow is itself resource-intensive: each task runs through multiple rounds of “thinking → researching → using tools → understanding context,” pushing token consumption to 1,000 times that of a simple query. The AI looks like it is working hard, but much of that activity is simply money being spent.
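To make the cost mechanics above concrete, here is a minimal sketch in Python. It assumes a flat price per million tokens and a simplified loop in which every round re-sends the full accumulated context with no caching. Both function names and the loop model are illustrative assumptions, not any vendor's billing formula. The only real figures used are the ones quoted above for Meta's top user, and since that cost is given as "nearly" $500,000, the implied price is an approximation.

```python
def implied_price_per_million(total_usd: float, total_tokens: float) -> float:
    """Blended price in USD per one million tokens."""
    return total_usd / (total_tokens / 1_000_000)


def agent_run_tokens(prompt_tokens: int, rounds: int, new_tokens_per_round: int) -> int:
    """Tokens billed across a multi-round run.

    Hypothetical model: each round re-sends the whole context so far and
    then appends `new_tokens_per_round` tokens (model output, tool results).
    """
    total = 0
    context = prompt_tokens
    for _ in range(rounds):
        total += context + new_tokens_per_round  # input re-sent + new tokens
        context += new_tokens_per_round          # context grows every round
    return total


# Figures from the text: 28.1 billion tokens, nearly $500,000.
meta_price = implied_price_per_million(500_000, 28.1e9)

# Made-up sizes, purely to compare one round against many rounds.
single_query = agent_run_tokens(prompt_tokens=1_000, rounds=1, new_tokens_per_round=500)
agent_task = agent_run_tokens(prompt_tokens=1_000, rounds=50, new_tokens_per_round=500)

print(f"implied price: ${meta_price:.2f} per million tokens")
print(f"agent task uses {agent_task / single_query:.0f}x the tokens of a single query")
```

Under this model the total grows roughly with the square of the number of rounds, because the context that is re-sent gets longer each time. That is one plausible reason multi-round work costs far more than the number of rounds alone would suggest.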
2. Upstream Companies Benefit: Your Costs Are Their Revenue
The token costs of downstream companies are revenue for upstream providers:
- Anthropic: revenue rose from $4.8 billion in the first quarter to an estimated $10.9 billion in the second, driven by businesses buying API access and using Claude Code;
- NVIDIA: revenue reached $81.6 billion in the latest quarter, because AI inference requires large numbers of GPUs. The more tokens are consumed, the greater the demand for GPUs, and the better for NVIDIA’s profits.
Upstream companies often present rising token consumption as a sign of advanced productivity, but for downstream companies tokens are a cost, not an asset. They become valuable only when the spending makes business processes more efficient.
3. Amazon Removes the Ranking System: Changing KPIs to Avoid Formalism
Amazon previously had a “KiroRank” leaderboard that ranked engineers by token consumption, and it required 80% of employees to use AI every week. As a result, employees burned tokens unnecessarily just to climb the rankings and feel more secure in their jobs.
Eventually, Amazon recognized the problem and replaced the leaderboard with a measure of “standardized deployment volume,” which focuses on whether engineers are producing useful code. This is Goodhart’s Law at work: when a measure becomes a target, it ceases to be a good measure. Chasing token counts today is the same kind of formalism that chasing working hours once was.
Other companies have followed suit: Shopify turned its leaderboard into a neutral dashboard with a “circuit breaker” mechanism that prevents excessive consumption; Duolingo removed AI-related performance evaluations; Microsoft cut back its authorization of external AI tools.
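A “circuit breaker” of the kind mentioned above could look roughly like the following Python sketch. It is not Shopify's implementation, which the source does not describe. The class name, the per-period token limit, and the exception are all hypothetical, chosen only to illustrate cutting off consumption once a budget is exhausted.

```python
class BudgetExceeded(Exception):
    """Raised when a request would push usage past the budget (hypothetical)."""


class TokenCircuitBreaker:
    """Tracks token usage for one user or team within one budget period."""

    def __init__(self, limit_tokens: int) -> None:
        if limit_tokens <= 0:
            raise ValueError("limit_tokens must be positive")
        self.limit_tokens = limit_tokens
        self.used_tokens = 0

    @property
    def is_open(self) -> bool:
        """True once the budget is fully used and further calls are blocked."""
        return self.used_tokens >= self.limit_tokens

    def charge(self, tokens: int) -> None:
        """Record usage, refusing any request that would exceed the limit."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used_tokens + tokens > self.limit_tokens:
            raise BudgetExceeded(
                f"request of {tokens} tokens exceeds remaining budget of "
                f"{self.limit_tokens - self.used_tokens}"
            )
        self.used_tokens += tokens

    def reset(self) -> None:
        """Start a new budget period."""
        self.used_tokens = 0
```

A caller would invoke `charge()` before or after each model request and treat `BudgetExceeded` as the signal to stop retrying and escalate to a human. That is the behavior the unlimited-access incidents in section 1 lacked.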
4. True AI Efficiency Lies in Business Integration
Many companies misunderstand what AI transformation means. They assume that handing out accounts and burning tokens amounts to success, but it does not:
- Uber found that although more code was being written with AI, users were not getting more useful features;
- GitClear, a code analysis company, found that AI-assisted code had a 9-fold increase in rework rates and an 8-fold increase in duplicated code. In other words, human inefficiency has simply been replaced with more expensive AI inefficiency.
True efficiency comes from integrating AI into business processes. Companies like OpenAI and Anthropic are hiring “forward-deployed engineers” to work on site with clients, understand their processes, manage permissions, and integrate AI solutions effectively.
In conclusion, burning tokens is not a sign of success. What really matters is showing how the use of AI has improved the company’s operations: whether it has shortened processes, reduced rework, or accelerated deliveries.
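One way to report AI impact in those terms is to divide token spend by delivered output instead of ranking people by consumption. The Python sketch below is hypothetical: the `TeamPeriod` fields and both ratios are assumptions for illustration, not Amazon's “standardized deployment volume” or any company's actual metric, and the sample numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class TeamPeriod:
    """Invented per-period record for one team (all fields are assumptions)."""
    token_cost_usd: float
    deployments: int
    reworked_changes: int
    total_changes: int


def cost_per_deployment(p: TeamPeriod) -> float:
    """Token spend per shipped deployment; infinite if nothing shipped."""
    if p.deployments == 0:
        return float("inf")
    return p.token_cost_usd / p.deployments


def rework_rate(p: TeamPeriod) -> float:
    """Share of changes that later had to be redone."""
    if p.total_changes == 0:
        return 0.0
    return p.reworked_changes / p.total_changes


# Invented numbers: spend quadruples while deliveries barely move.
before = TeamPeriod(token_cost_usd=10_000, deployments=20, reworked_changes=5, total_changes=100)
after = TeamPeriod(token_cost_usd=40_000, deployments=22, reworked_changes=30, total_changes=150)

for label, period in (("before", before), ("after", after)):
    print(label, round(cost_per_deployment(period), 2), round(rework_rate(period), 2))
```

In this invented case a consumption leaderboard would show the second period as a success, while both ratios show it getting worse.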
Final Summary
AI transformation is not about who spends the most money, but about who solves the most problems with AI. Don’t let your token costs become just numbers in someone else’s financial reports.