Amazon Shuts Down Internal Kiro Leaderboard After Employees Burned Through Too Many Costly Tokens

Published: May 29, 2026 | Based on reporting from the Financial Times via PC Gamer

Amazon has quietly pulled the plug on an internal leaderboard designed to measure how much employees used its Kiro agentic AI platform. The reason: employees were creating wasteful agents purely to burn through expensive tokens, driving up the company’s cloud computing costs. The leaderboard, intended to foster AI adoption, instead became a race to the bottom — or rather, to the top of a rankings list that rewarded consumption over utility.

The Kiro Leaderboard: A Gamification Experiment Gone Wrong

Amazon’s Kiro is an agentic AI platform that staff use to build AI agents: autonomous programs that can perform tasks, analyze data, or interact with other systems. Like many enterprise AI platforms, Kiro uses a token-based pricing model: each action an agent takes consumes tokens, and each token costs money. When Amazon introduced a company-wide mandate that employees must use AI or risk being replaced, it also launched an internal leaderboard to track Kiro usage across teams and individuals.

(The mandate was blunt. As PC Gamer paraphrased it: “use AI for your job or lose your job to AI.”)

The leaderboard ranked employees and teams by the number of tokens their agents consumed. The intent was to normalize AI usage and identify power users who could serve as internal champions. Instead, it created a perverse incentive: build agents that do nothing useful but consume massive token volumes.

Employees quickly realized that a simple loop generating random text or repeatedly querying a model would burn tokens much faster than a real productivity agent. They could run these agents overnight, rack up millions of tokens, and vault to the top of the leaderboard. The cost of those tokens — borne entirely by Amazon — soared.

Token Economics: Why This Mattered

Tokens are the units of text a large language model reads and writes, and the units by which usage is metered and billed. A typical agent might consume tens of thousands of tokens per task. A wasteful agent, designed to do nothing but loop through prompts and responses, can consume millions of tokens in hours. Under the pay-per-token models used by AI providers (and by Amazon’s own internal cost centers), that translates directly into real dollar costs.
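To make the arithmetic concrete, here is a minimal Python sketch of how token volume turns into spend under a flat pay-per-token rate. The price and the two token counts are assumptions chosen only to match the orders of magnitude mentioned above; none of them come from the reporting, and real rates vary by model and provider.

```python
# Hypothetical flat rate: NOT from the source. Real prices differ by model,
# provider, and whether tokens are input or output.
ASSUMED_USD_PER_MILLION_TOKENS = 10.0


def token_cost_usd(tokens: int,
                   usd_per_million: float = ASSUMED_USD_PER_MILLION_TOKENS) -> float:
    """Convert a token count into dollars under a flat pay-per-token rate."""
    return tokens / 1_000_000 * usd_per_million


# Illustrative volumes (assumed), in the ranges the article describes.
useful_task_tokens = 50_000            # "tens of thousands of tokens per task"
wasteful_overnight_tokens = 5_000_000  # "millions of tokens in hours"

useful_cost = token_cost_usd(useful_task_tokens)
wasteful_cost = token_cost_usd(wasteful_overnight_tokens)

print(f"useful task:    ${useful_cost:.2f}")
print(f"wasteful agent: ${wasteful_cost:.2f}")
print(f"ratio:          {wasteful_cost / useful_cost:.0f}x")
```

Whatever the real rate is, the ratio between the two figures depends only on token volume, which is why a leaderboard that rewards volume also rewards cost.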

Amazon’s AI infrastructure — built on AWS — is not free. Every token burned on Kiro consumes GPU compute, electricity, and cooling. The leaderboard unwittingly encouraged employees to maximize cloud spend rather than maximize productivity.

This is a known failure pattern in gamification: when the metric being scored (token consumption) doesn’t align with the desired outcome (useful AI adoption), participants exploit the metric. The result is often worse than no gamification at all.

Why Amazon Killed It — Not Just a Cost Cut

The Financial Times reported that the leaderboard was “deprecated.” That wording suggests Amazon saw the system itself as flawed, not just the behavior it provoked. Amazon could have capped individual token budgets or added manual review of agents, but those fixes would not have addressed the root cause: the leaderboard rewarded consumption rather than usefulness.

Alternatives Amazon might have considered — but apparently skipped:

Token budgets per employee: Would have capped waste but also limited legitimate experimentation.
Outcome-based scoring: Hard to automate; would require humans to judge agent usefulness.
Team-level accountability: Could shift blame but still incentivize token burning across a group.

None of these solve the fundamental misalignment. Amazon’s directive to use AI or be replaced, combined with a leaderboard that measured consumption, was a recipe for waste.

The deprecation is a quiet admission that raw usage metrics are toxic when tied to career incentives.

What This Means for Enterprise AI Adoption

The Kiro leaderboard story is not an isolated case. As more companies roll out internal AI platforms with token-based billing, the temptation to gamify usage is strong. If the metric is “number of agents built” or “tokens consumed,” employees will respond by building junk agents that consume tokens.

The lesson: measure outcomes, not activity. Track how many customer tickets an agent resolved, how much time a developer saved, or how many insights a data analysis agent generated. Token consumption is an input, not an output.
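The difference between the two kinds of metric can be sketched in a few lines of Python. Everything here is hypothetical: the `AgentStats` record, the `tickets_resolved` outcome field, and the sample numbers are illustrative assumptions, not details of Amazon’s leaderboard.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    owner: str
    tokens_consumed: int   # input: what the agent cost to run
    tickets_resolved: int  # output: a hypothetical measure of useful work


def rank_by_tokens(stats: list[AgentStats]) -> list[str]:
    """Volume ranking: most tokens first, regardless of what they achieved."""
    ordered = sorted(stats, key=lambda s: s.tokens_consumed, reverse=True)
    return [s.owner for s in ordered]


def rank_by_outcome(stats: list[AgentStats]) -> list[str]:
    """Value ranking: most tickets resolved first; fewer tokens breaks ties."""
    ordered = sorted(stats, key=lambda s: (-s.tickets_resolved, s.tokens_consumed))
    return [s.owner for s in ordered]


# Made-up sample data: one token-burning loop and two agents doing real work.
stats = [
    AgentStats("looper", tokens_consumed=5_000_000, tickets_resolved=0),
    AgentStats("helper", tokens_consumed=200_000, tickets_resolved=40),
    AgentStats("tinkerer", tokens_consumed=50_000, tickets_resolved=5),
]

print(rank_by_tokens(stats))   # ['looper', 'helper', 'tinkerer']
print(rank_by_outcome(stats))  # ['helper', 'tinkerer', 'looper']
```

On the same data, the token-burning loop tops the volume ranking and falls to last place in the outcome ranking. The hard part in practice is not the sorting but collecting an outcome field that is itself difficult to game.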

Amazon’s mistake was treating AI adoption as a volume game. The fix is to treat it as a value game.

Frequently Asked Questions

What is Kiro?

Kiro is Amazon’s agentic AI platform, which employees use to build and deploy AI agents for tasks ranging from data processing to workflow automation. It runs on AWS infrastructure and uses a token-based billing model.

How did employees game the leaderboard?

They created agents that performed useless loops — generating dummy text, repeatedly calling models — which consumed tokens rapidly. Since the leaderboard ranked by token usage, these wasteful agents pushed their creators to the top.

Why didn’t Amazon add filters or caps?

A cap on tokens would limit legitimate use. A filtering system to detect wasteful agents is possible but expensive and easily bypassed. Amazon chose to remove the leaderboard entirely rather than fight a losing battle of incentives.

Did Amazon punish the employees who exploited the system?

There are no reports of punishment. The behavior was a rational response to a poorly designed metric. Amazon appears to have accepted the design failure and moved on.

What should other companies learn from this?

Don’t gamify consumption. If you tie employee incentives to token usage, you will get token usage — not productivity. Align metrics with business outcomes, not activity proxies.

The Verdict

Amazon’s Kiro leaderboard was a textbook case of Goodhart’s Law: when a measure becomes a target, it ceases to be a good measure. The company’s directive to embrace AI was reasonable; the gamification mechanism was not. Deprecation was the only honest response.

For enterprises building internal AI platforms, the lesson is clear: design incentives for value, not volume. Otherwise, you’ll end up paying for a leaderboard that serves nobody but the cloud provider.