The Microsoft piece also goes over the various flavors of distillation, including response-based, feature-based, and relation-based distillation, as well as two fundamentally different modes of running it: offline and online distillation.
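To make the response-based flavor concrete, here is a minimal sketch of offline, response-based distillation in PyTorch. It is not drawn from the Microsoft piece; the toy teacher and student networks, the `distillation_loss` helper, and the `temperature` and `alpha` settings are all illustrative assumptions.

```python
# Minimal sketch of response-based (offline) knowledge distillation in PyTorch.
# The teacher is frozen; the student learns to match the teacher's softened
# output distribution (KL term) while also fitting the ground-truth labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-label KL term with an ordinary cross-entropy term."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_preds = F.log_softmax(student_logits / temperature, dim=-1)
    # T^2 rescales gradients so the soft term stays comparable to the hard term.
    kd_term = F.kl_div(soft_preds, soft_targets, reduction="batchmean") * temperature ** 2
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1 - alpha) * ce_term

# Toy models: a larger "teacher" and a smaller "student" classifier.
teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10)).eval()
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

inputs = torch.randn(32, 128)           # a batch of dummy features
labels = torch.randint(0, 10, (32,))    # dummy ground-truth classes

with torch.no_grad():                    # offline mode: teacher is fixed, only queried
    teacher_logits = teacher(inputs)

loss = distillation_loss(student(inputs), teacher_logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Feature-based and relation-based variants instead match intermediate activations or relationships between examples rather than final outputs, and online distillation updates the teacher and student together instead of freezing the teacher as above.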
DeepSeek’s success learning from bigger AI models raises questions about the billions being spent on the most advanced technology.
Since Chinese artificial intelligence (AI) start-up DeepSeek rattled Silicon Valley and Wall Street with its cost-effective models, the company has been accused of data theft through a practice that is common across the industry.
David Sacks says OpenAI has evidence that Chinese company DeepSeek used a technique called "distillation" to build a rival model.
One possible answer being floated in tech circles is distillation, an AI training method that uses bigger "teacher" models to train smaller but faster-operating "student" models.
OpenAI accuses Chinese AI firm DeepSeek of stealing its content through "knowledge distillation," sparking concerns over security, ethics, and national interests.
Top White House advisers this week expressed alarm that China's DeepSeek may have benefited from a method that allegedly piggybacks off the advances of US rivals called "distillation."
Microsoft and OpenAI are investigating whether DeepSeek, a Chinese artificial intelligence startup, illegally copied proprietary American technology, sources told Bloomberg.
After DeepSeek AI shocked the world and tanked the market, OpenAI says it has evidence that distillation of ChatGPT outputs was used to train DeepSeek's model.
OpenAI believes DeepSeek used a process called “distillation,” which helps make smaller AI models perform better by learning from larger ones.
DeepSeek’s AI breakthrough challenges Big Tech with a cheaper, efficient model. This may be bad for the incumbents, but good for everybody else.
The arrival of a Chinese upstart has shaken the AI industry, with investors rethinking their positioning in the space.