
Global Trend

Silicon Valley Hit by ‘Token Maxing’ Costs

Dong-A Ilbo | Updated 2026.04.13
‘Token maximization’ emerges as a Silicon Valley trend
As the agentic AI era arrives, adoption surges
Rising costs spur companies to track employee usage
Chip and software players race to maximize token efficiency
 
According to the New York Times, a recent internal report at OpenAI showed that one engineer used 210 billion tokens (the unit of data an AI model processes and generates) over the course of a week, topping the company's token consumption chart. That volume is enough text to fill the online encyclopedia Wikipedia 33 times over. Anthropic, an AI startup, drew similar attention when it emerged that a single user of its coding program "Claude Code" had consumed USD 150,000 (about KRW 200 million) worth of tokens in just one month.
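To get a sense of how token counts turn into bills of that size, the back-of-the-envelope sketch below converts token volumes into US dollars. The per-million-token prices and the usage figures in it are illustrative assumptions, not the actual rates or workloads behind the cases above; it only shows how a bill of that order of magnitude can arise.

# Illustrative token-cost arithmetic. The prices below are assumptions
# for illustration only; real provider rates vary by model and change often.
PRICE_PER_MILLION_INPUT = 3.00    # assumed USD per 1M input tokens
PRICE_PER_MILLION_OUTPUT = 15.00  # assumed USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated API bill in USD for the given token counts."""
    return (input_tokens / 1_000_000 * PRICE_PER_MILLION_INPUT
            + output_tokens / 1_000_000 * PRICE_PER_MILLION_OUTPUT)

# A hypothetical month of heavy agentic use: 40B tokens read, 2B generated.
print(f"${estimate_cost(40_000_000_000, 2_000_000_000):,.0f}")  # -> $150,000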

Across the global information technology (IT) industry, and in Silicon Valley in particular, striking cases are emerging in which a single person's AI usage costs exceed the annual IT infrastructure budgets of many startups. With the arrival of the era of "agentic AI," in which AI keeps working autonomously even while its user sleeps, the amount of AI an individual can consume—measured in tokens—has grown exponentially. Among US big tech companies, token throughput is increasingly treated as a kind of performance metric marking out an "engineer who uses AI well," and the term "tokenmaxxing" (maximizing token use) has been coined to describe the resulting race.

As the tokenmaxxing trend drives corporate AI usage fees to astronomical levels, even the semiconductor industry is entering a competition to optimize token costs.

● Tokenmaxxing craze sweeps Silicon Valley

 
A token is a fragment of information created by breaking text into the smallest meaningful units so that AI can process massive amounts of data and understand language. For example, the sentence "토큰 경제란 무엇인가" ("What is token economics?") is split into three tokens—"토큰" (token), "경제란" (economics), and "무엇인가" (what is)—and converted into numbers. The output generated through inference is handled the same way. Just as a car burns fuel, AI relentlessly consumes these tokens, its "digital fuel," throughout the entire process from parsing a command to producing an answer.
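The sketch below shows what this splitting looks like in code, using OpenAI's open-source tiktoken tokenizer as one concrete example; the library choice and encoding name are assumptions for illustration, and every model family splits text somewhat differently.

# Minimal tokenization sketch using the open-source tiktoken library
# (pip install tiktoken). The encoding name is an assumption for
# illustration; each model uses its own tokenizer, so the same sentence
# yields different token counts across providers.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "What is token economics?"
token_ids = enc.encode(text)                    # text -> integer token IDs
pieces = [enc.decode([i]) for i in token_ids]   # each ID back to its text fragment

print(token_ids)                 # the numbers the model actually processes
print(pieces)                    # the word fragments those numbers stand for
print(len(token_ids), "tokens consumed just to read this one sentence")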

With the advent of the era of tirelessly working agentic AI, tokenmaxxing has emerged as a defining trend in Silicon Valley. Because tokens are consumed whenever AI ingests materials, produces answers, or writes code, the amount of token consumption has become a direct indicator of how actively AI is being used.

In the US tech sector, there is a widespread culture of treating massive token consumption as a badge of honor. Pushing AI usage to its limits has come to be regarded as a core competitive strength for both engineers and companies. OpenAI and Meta have even created internal “AI usage leaderboards” that track employees’ token consumption and encourage competition. Jensen Huang, CEO of Nvidia, further accelerated this trend last month when he announced that he would provide engineers with an additional “AI token budget” equivalent to half of their annual base salary.

The challenge for companies is that token consumption directly translates into costs payable to AI providers. This is the backdrop for the rise of “tokenomics,” the economic value structure of tokens, as a key industry theme.

It is in the same vein that CEO Huang emphasized last month that “future data centers will not be mere server storage spaces, but ‘token factories.’” Ultimately, token economics implies that efficient control of tokens will form a new industrial structure that determines corporate success or failure.

● Drive to improve “Toseongbi” amid ballooning bills

Startled by enormous bills, companies have recently begun overhauling how they use AI. For small and medium-sized startups with limited capital and an urgent need to improve profitability, so-called "Toseongbi" (a compound of "token" and "gaseongbi," the Korean word for cost-effectiveness) has become increasingly important. Rather than burning tokens indiscriminately, the goal is to extract maximum operational efficiency from finite resources.

According to the Wall Street Journal (WSJ), global AI automation platform Zapier recently introduced an internal dashboard that tracks each employee's token usage in real time. Unlike the big tech leaderboards, its purpose is not to showcase how much AI is used but to assess token efficiency. Kumo AI, a US startup with about 60 employees, has likewise been tracking how many tokens each engineer consumes since the beginning of this year in an effort to control costs.
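As a rough illustration of what such a dashboard aggregates, the sketch below tallies token usage per employee from a usage log and ranks people by estimated spend. The record format, names, and blended price are hypothetical, not Zapier's or Kumo AI's actual schema or rates.

# Hypothetical per-employee token accounting, sketched from the article's
# description. The log format and flat blended price are assumptions; a real
# dashboard would pull such records from the AI provider's usage reports.
from collections import defaultdict

ASSUMED_PRICE_PER_MILLION = 10.0  # illustrative blended USD rate per 1M tokens

usage_log = [
    {"employee": "alice", "input_tokens": 1_200_000, "output_tokens": 300_000},
    {"employee": "bob",   "input_tokens": 9_500_000, "output_tokens": 2_100_000},
    {"employee": "alice", "input_tokens":   800_000, "output_tokens": 150_000},
]

totals = defaultdict(int)
for record in usage_log:
    totals[record["employee"]] += record["input_tokens"] + record["output_tokens"]

for employee, tokens in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    cost = tokens / 1_000_000 * ASSUMED_PRICE_PER_MILLION
    print(f"{employee:<10} {tokens:>12,} tokens  ~${cost:,.2f}")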

The situation is similar in Korea. Major IT service providers such as Samsung SDS, LG CNS, and SK C&C are treating the optimization of client token expenditures as the top priority in their enterprise AI deployments and are focusing intensely on building customized, lightweight infrastructures. According to market research firm IDC, the number of agentic AI systems operating worldwide is expected to exceed 1 billion by 2029, which is 40 times the level in 2025. Agentic AI is projected to perform more than 217 billion tasks per day, and the annual cost of token transmission to support this is forecast to easily surpass USD 6.8 billion (about KRW 101 trillion).

Semiconductor companies are also focusing on developing inference chips that optimize token costs. This is one of the reasons behind Nvidia’s acquisition of inference chip company Groq.

Kim Doo-hyun, a professor in the Department of Computer Engineering at Konkuk University, said, “This is a point in time when sophisticated token management strategies are required, such as combining optimization algorithms that maximize text processing efficiency with small language models (sLMs) specialized in particular industry domains.” Oh Hak-joo, a professor in the Department of Computer Science at Korea University, said, “Finding the optimal balance between token productivity and costs is the key survival task in the era of autonomous AI.”
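One common pattern behind such strategies is cost-aware routing: sending routine requests to a cheap, domain-tuned small model and reserving the large general model for hard cases. The sketch below is a hypothetical illustration of that idea only; the model names, prices, and complexity heuristic are all assumptions, not the professors' specific methods.

# Hypothetical cost-aware routing between a small domain model and a large
# general model. Model names, prices, and the heuristic are assumptions
# made for illustration; they do not correspond to any specific vendor.
ASSUMED_PRICES = {                 # USD per 1M tokens (illustrative)
    "small-domain-model": 0.20,
    "large-general-model": 10.00,
}

def pick_model(prompt: str) -> str:
    """Crude heuristic: short, routine prompts go to the small model."""
    looks_hard = len(prompt.split()) > 200 or "prove" in prompt.lower()
    return "large-general-model" if looks_hard else "small-domain-model"

def estimated_cost(model: str, tokens: int) -> float:
    return tokens / 1_000_000 * ASSUMED_PRICES[model]

prompt = "Summarize yesterday's deployment log in three bullet points."
model = pick_model(prompt)
print(model, f"~${estimated_cost(model, 2_000):.4f} for a 2,000-token call")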

Jeon Hye-jin; Kim Jae-hyung

AI-translated with ChatGPT. Provided as is; original Korean text prevails.