* Distillation violates terms of service of U.S. tech companies like OpenAI
* Stopping distillation is challenging due to open-source models and detection difficulties
* Commerce nominee Lutnick criticizes DeepSeek in congressional hearing
By Stephen Nellis, Krystal Hu, Jeffrey Dastin, Anna Tong and
Katie Paul
Jan 29 (Reuters) - Top White House advisers this week
expressed alarm that China's DeepSeek may have benefited from a
method called "distillation" that allegedly piggybacks off the
advances of U.S. rivals.
The technique, which involves one AI system learning from
another AI system, may be difficult to stop, according to
executive and investor sources in Silicon Valley.
DeepSeek this month rocked the technology sector with a new
AI model that appeared to rival the capabilities of U.S. giants
like OpenAI, but at much lower cost. And the China-based company
gave away the code for free.
Some technologists believe that DeepSeek's model may have
learned from U.S. models to make some of its gains. The
distillation technique involves having an older, more
established and powerful AI model evaluate the quality of the
answers coming out of a newer model, effectively transferring
the older model's learnings.
That means the newer model can reap the benefits of the
massive investments of time and computing power that went into
building the initial model without the associated costs.
This form of distillation, which is different from how
most academic researchers previously used the word, is a common
technique in the AI field. However, it is a violation of
the terms of service of some prominent models put out by U.S.
tech companies in recent years, including OpenAI.
The ChatGPT maker said that it knows of groups in China
actively working to replicate U.S. AI models via distillation
and is reviewing whether or not DeepSeek may have distilled its
models inappropriately, a spokesperson told Reuters.
Naveen Rao, vice president of AI at San Francisco-based
Databricks, which does not use the technique when terms of
service prohibit it, said that learning from rivals is "par for
the course" in the AI industry. Rao likened this to how
automakers will buy and then examine one another's engines.
"To be completely fair, this happens in every scenario.
Competition is a real thing, and when it's extractable
information, you're going to extract it and try to get a win,"
Rao said. "We all try to be good citizens, but we're all
competing at the same time."
Howard Lutnick, President Donald Trump's nominee for
Secretary of Commerce, who would oversee future export controls
on AI technology, told the U.S. Senate during a confirmation
hearing on Wednesday that it appeared DeepSeek had
misappropriated U.S. AI technology and vowed to impose
restrictions.
"I do not believe that DeepSeek was done all above
board. That's nonsense," Lutnick said. "I'm going to be rigorous
in our pursuit of restrictions and enforcing those restrictions
to keep us in the lead."
David Sacks, the White House's AI and crypto czar, also
raised concerns about DeepSeek distillation in a Fox News
interview on Tuesday.
DeepSeek did not immediately answer a request for
comment on the allegations.
OpenAI added it will work with the U.S. government to
protect U.S. technology, though it did not detail how.
"As the leading builder of AI, we engage in countermeasures
to protect our IP, including a careful process for which
frontier capabilities to include in released models," the
company said in a statement.
The most recent round of concern in Washington about
China's use of U.S. products to advance its tech sector is
similar to previous concerns about the semiconductor industry,
where the U.S. has imposed restrictions on what chips and
manufacturing tools can be shipped to China and is examining
restricting work on certain open technologies.
NEEDLE IN A HAYSTACK
Technologists said blocking distillation may be harder
than it looks.
One of DeepSeek's innovations was showing that a relatively
small number of data samples - fewer than one million - from a
larger, more capable model could drastically improve the
capabilities of a smaller model.
When popular products like ChatGPT have hundreds of millions
of users, such small amounts of traffic could be hard to detect
- and some models, such as Meta Platforms' Llama and
French startup Mistral's offerings, can be downloaded freely and
used in private data centers, meaning violations of their terms
of service may be hard to spot.
"It's impossible to stop model distillation when you have
open-source models like Mistral and Llama. They are available to
everybody. They can also find OpenAI's model somewhere through
customers," said Umesh Padval, managing director at Thomvest
Ventures.
The license for Meta's Llama model requires those using it
for distillation to disclose that practice, a Meta spokesperson
told Reuters.
DeepSeek in a paper did disclose using Llama for some
distilled versions of the models it released this month, but did
not address whether it had ever used Meta's model earlier in the
process. The Meta spokesperson declined to say whether the
company believed DeepSeek had violated its terms of service.
One source familiar with the thinking at a major AI lab said
the only way to stop firms like DeepSeek from distilling U.S.
models would be stringent know-your-customer requirements
similar to how financial companies identify with whom they do
business.
But nothing like that is set in stone, the source said. The
administration of former President Joe Biden had put forth such
requirements, which President Donald Trump may not embrace.
The White House did not immediately respond to a request for
comment.
Jonathan Ross, chief executive of Groq, an AI computing
company that hosts AI models in its cloud, has blocked all
Chinese IP addresses from accessing its cloud to stop Chinese
firms from allegedly piggybacking off the AI models it hosts.
"That's not sufficient, because people can find ways to get
around it," Ross said. "We have ideas that would allow us to
prevent that, and it's going to be a cat and mouse game ... I
don't know what the solution is. If anyone comes up with it, let
us know, and we'll implement it."