“DeepSeek R1 and others are scraping data without permission, causing billions of dollars in losses every year”: joint response
“Also a security threat”: raised as an issue in U.S. Congress
U.S. Big Tech pushback, signs of a brewing diplomatic dispute
The major U.S. Big Tech companies are joining forces to block Chinese AI firms from monetizing the high‑performance artificial intelligence (AI) training outputs they have developed. They have also raised the issue in the U.S. Congress, arguing that Chinese firms’ “theft of training outcomes” could affect national security in the future. Analysts say the U.S.–China AI rivalry has entered a “second round,” expanding from restrictions on semiconductor exports to China to controls on knowledge leakage.
According to Bloomberg News on the 6th (local time), OpenAI, Google, and Anthropic are jointly moving to stop Chinese AI companies that are boosting their global AI market share by extracting their data without authorization. It is unusual for the three fiercely competing Big Tech companies to align in this way. Through the Frontier Model Forum, a non‑profit organization established in 2023, they plan to share information related to attempts by Chinese firms to extract data.
● U.S. Big Tech losing billions of dollars annually to Chinese distillation attacks
The unauthorized data extraction by Chinese firms that Big Tech is contesting is related to an AI training method known as “distillation.” AI distillation is a kind of compressed training technique that transfers the vast data and capabilities of a high‑performance AI model (teacher model) into a smaller model (student model). Google’s “Gemini Flash” series is a compact AI model developed by distilling its higher‑tier model, “Gemini Pro.”
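To make the teacher/student mechanism described above concrete, here is a minimal sketch of the standard distillation objective: the student model is trained to match the teacher's temperature-softened output distribution by minimizing a KL-divergence loss. This is a generic illustration of the technique, not the actual training code of Google, OpenAI, or DeepSeek, and the function names are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    # A temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences among classes, not just its top answer.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL divergence between the teacher's and the student's softened
    # output distributions -- the quantity the student minimizes.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that reproduces the teacher's outputs incurs zero loss;
# a mismatched student incurs a positive loss.
teacher = [3.0, 1.0, 0.2]
assert abs(distillation_loss(teacher, teacher)) < 1e-9
assert distillation_loss(teacher, [0.2, 1.0, 3.0]) > 0
```

The same objective is what makes unauthorized distillation cheap: an attacker needs only the teacher's outputs on chosen prompts, not its weights or training data.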
The problem arises when another company attempts distillation. A prime example is the “DeepSeek R1” AI model from China’s DeepSeek, which shocked the world in January last year. DeepSeek denies the distillation allegations, but the industry believes DeepSeek R1 was able to rapidly achieve high performance with limited resources because it distilled OpenAI’s ChatGPT model. Following DeepSeek’s success, the number of AI startups in China adopting the distillation approach has surged.
As the market share of China’s “cost‑effective AI” providers rises rapidly, losses are mounting for the Big Tech companies that invested astronomical sums to develop high‑performance AI. Bloomberg estimated that “unauthorized distillation attacks from China are causing U.S. Big Tech companies losses of billions of dollars every year.” Critics argue that Chinese companies are “free‑riding” on U.S. AI systems trained at that enormous cost.

● Is a ‘second round’ of the U.S.–China AI war beginning?
In February this year, OpenAI petitioned the U.S. House Select Committee on the Strategic Competition between the United States and the Chinese Communist Party, arguing that China’s distillation attacks amount to free‑riding that effectively steals competitors’ models. Anthropic also disclosed that three Chinese firms—DeepSeek, Moonshot AI, and MiniMax—used 24,000 fake accounts to extract more than 16 million data items from “Claude,” its AI model. Google stated on its company blog that there had been more than 100,000 attempts to extract data without authorization in languages other than English.
Another risk Big Tech points to is that distillation, because it trains only on selected data, can strip away all of an AI system’s safety guardrails. A hostile nation could, for example, exploit such a model for unethical purposes such as generating deadly pathogens, which could in turn threaten U.S. national security. Industry observers note that the pushback from the major Big Tech companies could escalate into a diplomatic dispute: just as the United States once restricted exports of high‑performance semiconductors to hinder China’s AI development, it may now take measures to block the transfer of AI knowledge.
Choi Ji-won
AI-translated with ChatGPT. Provided as is; original Korean text prevails.
ⓒ dongA.com. All rights reserved. Reproduction, redistribution, or use for AI training prohibited.