Under China's wing: DeepSeek is preparing a new artificial intelligence model
DeepSeek is looking to make the most of its advantage. The Chinese startup triggered a sell-off of more than $1 trillion in global stock markets with a low-cost artificial intelligence model that outperformed many Western competitors.
The Hangzhou-based company is now accelerating the launch of the successor to R1, which was released in January, according to three people familiar with the company. DeepSeek had planned to release R2 in early May but now wants it out as early as possible, two of them said, without providing details.
The company says it hopes the new model will produce better code and be able to reason in languages beyond English. Details of the accelerated timeline for the launch of R2 have not been previously reported.
DeepSeek did not respond to a request for comment on this story.
Rivals are still digesting the implications of R1, which was built with less powerful Nvidia chips but is competitive with models developed at a cost of hundreds of billions of dollars by US technology giants. « The launch of DeepSeek's R2 model could be a pivotal moment in the AI industry, » said Vijayasimha Alilughatta, chief operating officer of Indian technology services provider Zensar. DeepSeek's success in creating cost-effective artificial intelligence models « is likely to encourage companies around the world to speed up their own efforts … breaking the grip of the few dominant players in this area, » he said.
R2 is likely to worry the US government, which has identified leadership in AI as a national priority. Its release could further galvanize Chinese authorities and companies, dozens of which say they have started to integrate DeepSeek models into their products.
Little is known about DeepSeek, whose founder Liang Wenfeng became a billionaire through his hedge fund High-Flyer. Liang, who was described by a former employer as « restrained and introverted », has not spoken to any media since July 2024.
Reuters interviewed a dozen former employees, as well as fund professionals familiar with the operations of DeepSeek and its parent company High-Flyer. It also reviewed state media articles, social media posts from the companies and research papers dating back to 2019.
They told the story of a company that operates more like a research laboratory than a for-profit enterprise and is unburdened by the hierarchical traditions of China's high-pressure technology industry, even as it became responsible for what many investors consider the latest breakthrough in artificial intelligence.
A different path
Liang was born in 1985 in a rural village in the southern province of Guangdong. He later earned degrees in communication engineering at the elite Zhejiang University.
One of his first jobs was running a research department at a smart imaging company in Shanghai. His then boss, Zhou Chaoen, told state media on February 9 that Liang had hired award-winning algorithm engineers and worked with a « flat management style ».
At DeepSeek and High-Flyer, Liang has similarly avoided the practices of Chinese technology giants, which are known for rigid top-down management, low pay for young employees and « 996 »: working from 9 a.m. to 9 p.m., six days a week.
Liang opened his Beijing office within walking distance of Tsinghua University and Peking University, the two most prestigious educational institutions in China. He regularly delved into technical details and was happy to work alongside Gen-Z interns and recent graduates, who made up most of the workforce, according to two former employees. They also described usually working eight-hour days in a collaborative atmosphere.
« Liang gave us control and treated us as experts. He was constantly asking questions and learning alongside us, » said 26-year-old researcher Benjamin Liu, who left the company in September. « DeepSeek allowed me to take ownership of critical parts of the pipeline, which was very exciting. »
Liang did not respond to questions sent via DeepSeek.
While Baidu and other Chinese technology giants raced to build their consumer-facing versions of ChatGPT in 2023 and profit from the global boom in artificial intelligence, Liang told Chinese media outlet Waves last year that he deliberately avoided spending heavily on app development, focusing instead on improving the quality of the AI model.
Both DeepSeek and High-Flyer are known for paying generously, according to three people familiar with their compensation practices. At High-Flyer, it is not uncommon for a senior scientist to earn 1.5 million yuan a year (about $206,000), while competitors rarely pay more than 800,000 yuan, said one of the people, a rival fund manager who knows Liang.
The generosity is funded by High-Flyer, which has become one of the most successful funds of its kind in China and, even after a government crackdown on the sector, still manages tens of billions of yuan, according to two people in the industry.
Computing power
DeepSeek's success with a low-cost model is built on High-Flyer's decade-long and significant investment in research and computing power, three people said.
The fund was an early pioneer in AI-driven trading. A senior executive said in 2020 that High-Flyer was going all in on AI, reinvesting 70% of its revenue, mainly in AI research.
High-Flyer spent 1.2 billion yuan on two supercomputer clusters for artificial intelligence in 2020 and 2021. The second cluster, Fire-Flyer II, consists of about 10,000 NVIDIA A100 chips used to train models.
DeepSeek had not yet been established at the time, so the accumulation of computing power attracted the attention of Chinese securities regulators, said a person with direct knowledge of officials' thinking. « The regulators wanted to know: Why do they need so many chips? » the person said. « How would they use them? What impact would this have on the market? »
Authorities decided not to intervene, a move that would prove decisive for DeepSeek's fate: the United States banned the export of A100 chips to China in 2022, by which point Fire-Flyer II was already in operation. Beijing now celebrates DeepSeek but has instructed it not to engage with the media without approval, according to a person familiar with Chinese official thinking.
Authorities have asked Liang not to speak publicly, worried that too much media noise would attract unnecessary attention, the person said.
China's cabinet and commerce ministry, as well as the Chinese securities regulator, did not respond to requests for comment.
As one of the few companies with a large A100 cluster, High-Flyer and DeepSeek have been able to attract some of the best research talent in China, two former employees said. « The main advantage of huge (computing) resources is that it allows large-scale experimentation, » said Liu, the former employee.
Some Western entrepreneurs in the field, such as Scale AI CEO Alexandr Wang, claim that DeepSeek has up to 50,000 higher-end Nvidia chips that are banned from export to China. He has not provided evidence for the allegation, nor did he respond to Reuters' requests to provide evidence.
DeepSeek did not respond to Wang's claims. Two former employees attribute the company's success to Liang's focus on more cost-efficient AI architecture.
- The startup has used techniques such as Mixture-of-Experts (MoE) and multi-head latent attention (MLA), which lead to much lower computing costs, its research papers show.
- The MoE technique divides the AI model into different areas of expertise and activates only those relevant to a query, unlike more common architectures that use the whole model.
- The MLA architecture allows the model to process different aspects of one piece of information at the same time, helping it to detect key details more efficiently.
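The MoE idea described above, activating only the experts relevant to a query, can be illustrated with a minimal sketch. This is not DeepSeek's implementation: the gate weights, the toy "experts" and the function names below are all hypothetical, chosen only to show top-k routing in plain Python.

```python
import math


def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every input feature by a constant."""
    def expert(x):
        return [scale * v for v in x]
    return expert


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts only."""
    # One gate score per expert: dot product of x with that expert's gate vector.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)
    # Keep the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = [0.0] * len(x)
    for i in chosen:
        weight = probs[i] / norm  # renormalise over the selected experts
        expert_out = experts[i](x)
        output = [o + weight * v for o, v in zip(output, expert_out)]
    return output, chosen


# Four toy experts, of which only two run for any given input (assumed values).
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, chosen = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print(chosen, output)
```

With four experts and `top_k=2`, only half of the expert computations are performed for each input, which is the source of the cost savings the technique is credited with; production models apply the same principle with many more experts and learned gate weights.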
While competitors such as France's Mistral have developed MoE-based models, DeepSeek is the first company to rely heavily on this architecture while achieving parity with more expensive models.
DeepSeek's prices are 20 to 40 times lower than those for equivalent OpenAI models, analysts from brokerage Bernstein said in early February.
So far, Western and Chinese technology giants have signaled that they plan to continue spending heavily on AI, but DeepSeek's success with R1 and its earlier V3 model has led some to change strategies.
OpenAI has cut prices this month, while Google's Gemini has introduced discounted access tiers. After the release of R1, OpenAI also released the o3-mini model, which relies on less computing power.
Adnan Masood of US technology services provider UST told Reuters that his laboratory had run benchmark tests which found that R1 often uses three times as many tokens, or units of data processed by the AI model, as OpenAI's scaled-down model.
The state's embrace
Even before R1 attracted worldwide attention, there were signs that DeepSeek had won Beijing's favor. In January, state media reported that Liang had attended a meeting with Chinese Premier Li Qiang in Beijing as the designated representative of the AI sector, ahead of the leaders of better-known companies.
The subsequent fanfare over the cost competitiveness of its models reinforced Beijing's belief that it can out-innovate the United States, with Chinese companies and state authorities embracing DeepSeek models at a pace not seen for other companies.
At least 13 Chinese city governments and 10 state-owned energy companies say they have deployed DeepSeek in their systems, while technology giants Lenovo, Baidu and Tencent, the owner of China's largest social media app WeChat, have integrated DeepSeek models into their products.
Chinese leader Xi Jinping and Li « signaled that they were supporting DeepSeek, » said Alfred Wu, an expert on Chinese policymaking at Singapore's Lee Kuan Yew School of Public Policy. « Now everyone just approves it. »
China's embrace comes at a time when governments from South Korea to Italy are removing DeepSeek from national app stores, citing privacy concerns. « If DeepSeek becomes the go-to AI model in Chinese state organizations, Western regulators may see this as another reason to escalate restrictions on AI chips or software collaborations, » said Stephen Wu, an artificial intelligence expert and founder of hedge fund Carthage Capital.
Further restrictions on advanced AI chips are a challenge that Liang acknowledges. « Our problem has never been funding, » he told Waves in July. « It is the embargo on high-end chips. »