
AI Chip

‘TurboQuant’ 6x Memory Compression Shocks Chip Market

Dong-A Ilbo | Updated 2026.03.28
Panic sweeps through the memory chip market
“One can do the work of six HBM chips… If the source is opened, commercialization possible within the year”
“Once commercialized, memory use will actually increase… AI will spread faster and at lower cost”
 
《Memory stocks swing, what is Google’s ‘TurboQuant’…

Google has unveiled an algorithm called “TurboQuant” that can reduce artificial intelligence (AI) memory usage by up to one-sixth, sending shock waves through global stock markets. Views are divided over whether this will drive down the value of memory semiconductors or instead become a catalyst for an explosion in AI investment.》

 
Google has published an algorithm that drastically cuts the use of memory chips, roiling markets and the semiconductor industry. The centerpiece is “TurboQuant,” an algorithm that dramatically reduces the memory space required for AI computation. Fears that demand for key memory semiconductors such as high-bandwidth memory (HBM) will plunge have gripped the market. In the semiconductor industry and academia, however, there is also a counterview that this attempt to substitute software innovation for massive hardware investment could instead accelerate the popularization of AI and, in the long term, trigger an explosion in memory demand.

● “One can do what six HBMs used to do”

On the 24th (local time), when Google Research, Google’s in-house research division, introduced TurboQuant on its official blog, the market reacted immediately. Share prices of global memory semiconductor companies, from US-based Micron to South Korea’s Samsung Electronics and SK hynix, fell for a second consecutive day. The emergence of the low-cost, high-efficiency Chinese AI model DeepSeek early last year is being cited as a precedent for the current situation, heightening investor anxiety.

TurboQuant is a quantization algorithm that radically shrinks the “key-value (KV) cache,” where large language model (LLM) AIs temporarily store data so they do not forget previous context during long conversations. It converts the cached data into a simpler, lower-precision form to cut memory usage. Google stated that it compressed the KV cache to about one-sixth of its previous size. It is comparable to using a “vacuum compression bag” to reduce the bulk of a thick winter comforter.
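Google has not released TurboQuant’s implementation details, but the general idea of KV-cache quantization can be sketched in a few lines. The function names, the 4-bit target, and the per-group scaling below are illustrative assumptions, not Google’s actual method.

import numpy as np

def quantize_kv_int4(kv: np.ndarray, group_size: int = 64):
    """Toy per-group 4-bit quantization of a KV-cache tensor.
    Codes are stored in int8 for clarity; a real kernel would pack two per byte.
    This is a generic illustration, not Google's TurboQuant."""
    groups = kv.reshape(-1, group_size)                        # one row per quantization group
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0   # map each group to [-7, 7]
    scales = np.where(scales == 0, 1.0, scales)                 # avoid dividing by zero
    codes = np.clip(np.round(groups / scales), -7, 7).astype(np.int8)
    return codes.reshape(kv.shape), scales.astype(np.float16)

def dequantize_kv_int4(codes: np.ndarray, scales: np.ndarray, group_size: int = 64):
    """Reconstruct an approximate float16 KV cache from codes and per-group scales."""
    groups = codes.reshape(-1, group_size).astype(np.float16)
    return (groups * scales).reshape(codes.shape)

# Example: a cache of 4,096 tokens with hidden size 1,024 (assumed dimensions).
kv = np.random.randn(4096, 1024).astype(np.float16)
codes, scales = quantize_kv_int4(kv)
approx = dequantize_kv_int4(codes, scales)
# Going from 16-bit values to 4-bit codes (plus small scale overhead) is roughly
# a 4x reduction; published schemes combine lower bit widths and other tricks
# to approach the ~6x figure cited in the article.
print("max abs error:", float(np.abs(kv - approx).max()))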

This has been interpreted to mean that one HBM chip can now handle what previously required six, amplifying market unease. A semiconductor industry official likened it to “data that had been stuck on a single-lane road now rushing smoothly from HBM to the graphics processing unit (GPU) on a four-lane highway.” In fact, Google emphasized that when TurboQuant is applied, the computational performance of NVIDIA’s “H100” AI accelerator improves by a factor of up to eight.
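The “one can do the work of six” framing follows from back-of-the-envelope arithmetic on KV-cache size. The model dimensions below are illustrative assumptions (loosely modeled on a large open LLM), not figures from Google’s announcement.

# Back-of-the-envelope KV-cache sizing; all dimensions are illustrative assumptions.
layers   = 80        # transformer layers
kv_heads = 8         # key/value heads (grouped-query attention)
head_dim = 128       # dimension per head
seq_len  = 128_000   # tokens kept in context
batch    = 2         # concurrent requests

# Each token stores one key and one value vector per layer.
elems     = 2 * layers * kv_heads * head_dim * seq_len * batch
fp16_gib  = elems * 2 / 2**30   # 16-bit baseline, in GiB
turbo_gib = fp16_gib / 6        # the article's claimed ~6x compression

print(f"FP16 KV cache     : {fp16_gib:,.1f} GiB")   # ~78 GiB, roughly one 80 GB HBM stack
print(f"Compressed (~1/6) : {turbo_gib:,.1f} GiB")  # ~13 GiB
# The HBM capacity that held one such workload before could hold about six
# after compression -- the basis of the "one does the work of six" claim.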

There is also speculation that Google will present a formal paper, together with immediately usable open-source code for TurboQuant, at the International Conference on Learning Representations (ICLR 2026), a leading global AI academic conference to be held in Brazil on the 23rd of next month. The industry expects that once the source code is released, commercialization could begin as early as the fourth quarter of this year (October–December).

● “Triggering AI popularization… bringing forward the memory boom”

 
Experts caution that TurboQuant remains at the theoretical stage and that its real-world impact still needs to be watched. Kim Jung-ho, professor at the School of Electrical Engineering at KAIST, said, “It is too early to take Google’s claimed effects at face value, and the market is overreacting,” adding, “Rather than directly replacing HBM, it could change the way NAND flash-based storage is utilized.”

Even if TurboQuant is commercialized rapidly, the argument that it may in fact fuel a “memory boom” is gaining traction. Bloomberg News, citing Morgan Stanley and JPMorgan Chase, reported that TurboQuant’s development could give rise to “Jevons paradox” in the memory semiconductor sector. Jevons paradox refers to the phenomenon in which technological advances that improve the efficiency of a resource’s use actually increase demand for that resource and raise total consumption. In other words, as computational efficiency improves, AI services may proliferate more rapidly and model sizes may grow, ultimately increasing memory usage. The same theory was advanced during last year’s “DeepSeek shock.”
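The Jevons-paradox argument reduces to a multiplication: if memory per query falls sixfold but cheaper AI expands usage by more than sixfold, total memory consumed still rises. The demand-growth figure in the sketch below is a made-up illustration, not an analyst forecast.

# Toy Jevons-paradox arithmetic; every number here is an illustrative assumption.
memory_per_query_before = 6.0   # arbitrary units
memory_per_query_after  = 1.0   # ~6x more efficient, per the article's compression claim

queries_before = 100            # baseline usage
queries_after  = 1_000          # hypothetical 10x growth as AI becomes cheaper to serve

total_before = memory_per_query_before * queries_before   # 600 units
total_after  = memory_per_query_after  * queries_after    # 1,000 units

# Per-query efficiency improved 6x, yet total memory consumption rose ~1.7x --
# the pattern the Morgan Stanley and JPMorgan analysts are pointing to.
print(total_before, total_after)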

Han Jin-ho, a division leader at the Electronics and Telecommunications Research Institute (ETRI), forecast, “Rather than a decline in memory demand, it is more likely that the industry will evolve toward larger computation volumes based on high-performance computing.” Han In-soo, professor at the School of Electrical Engineering at KAIST, who participated in Google’s TurboQuant research, said, “As AI shifts from a focus on high capacity to a focus on high efficiency, AI will become cheaper and spread more quickly, while demand for semiconductors is also expected to become qualitatively more sophisticated.”

 


Lee Dong-hoon; Lee Min-a; Choi Ji-won

AI-translated with ChatGPT. Provided as is; original Korean text prevails.