DeepSeek’s Rise: How a Chinese Start-Up Went From Stock Trader to A.I. Star
The little-known artificial intelligence firm has emphasized research, even as it
emerged as the brainchild of a hedge fund.
By Meaghan Tobin, Paul Mozur and Alexandra Stevenson
Meaghan
Tobin and Paul Mozur reported from Taipei, Taiwan, and Alexandra Stevenson from
Hong Kong.
https://www.nytimes.com/2025/01/28/business/deepseek-owner-china-ai.html
Jan. 28, 2025
Two years
ago, when big-name Chinese technology companies like Baidu and Alibaba were
chasing Silicon Valley’s advances in artificial intelligence with splashy
announcements and new chatbots, DeepSeek took a different approach. It zeroed
in on research.
The strategy
paid off.
The Chinese
start-up has jolted the tech world with its claim that it created a powerful
A.I. model that was significantly cheaper to build than the offerings of its
better-funded American rivals.
In the
rivalry between China and the United States over domination of artificial
intelligence, DeepSeek seemed to come out of nowhere. In fact, it has
skyrocketed through China’s tech world in recent years with a path that was
anything but conventional.
Its mission
to pursue research mirrors that of companies like OpenAI, the Silicon Valley
firm that marked an American signature over A.I. in the fall of 2022. But the
similarities mostly end there.
DeepSeek’s
origins are in finance, not technology for technology’s sake. Its parent
company, a Chinese hedge fund called High-Flyer, began not as a laboratory
devoted to safeguarding humanity from A.I., like OpenAI, but as a business
using A.I. to make bets in the Chinese stock market.
High-Flyer
had thrived by capitalizing on a market dominated by China’s retail investors,
who are known for jumping in and out of stocks impulsively. In 2021, High-Flyer
found itself pressured by regulatory crackdowns in China on speculative
trading, which the authorities in Beijing felt was at odds with their attempts
to keep markets calm.
So
High-Flyer pursued a new opportunity that it said aligned better with Chinese
government priorities: advanced A.I.
“We want to
do things with greater value and things that go beyond the investment industry,
but it has been misinterpreted as A.I. stock speculation,” High-Flyer’s chief
executive, Lu Zhengzhe, told Chinese state media in 2023. “We have set up a new
team independent of investment, which is equivalent to a second start-up.”
DeepSeek was
born. As with many other Chinese start-ups, DeepSeek came at an established
market with a different business approach.
DeepSeek’s
latest model for artificial intelligence is believed to be nearly as powerful
as American rivals but far more efficient. Its success suggests that Silicon
Valley’s A.I. lead has shrunk. DeepSeek’s breakthrough, despite efforts by
Washington to limit Chinese access to the advanced chips needed for A.I.,
raises questions about how effective those controls can be long term — although
DeepSeek’s founder has acknowledged that the chip restrictions are a
limitation.
DeepSeek did
not rely on making consumer-facing A.I. products for revenue, and only this
month released its first chatbot, which allows anyone to generate text and
photos with simple commands. Instead, the company used the money that
High-Flyer made from stock trading to bankroll ambitious research. The approach
set it apart from U.S. rivals, all of which are ultimately consumer technology
companies.
This
unconventional approach also allowed DeepSeek to sidestep stringent regulations
the Chinese government has placed on A.I. use by the public. Because its focus
was research and selling to businesses who use its model — and, until the
release of its chatbot this month, not consumer applications — its early work
did not trigger the same government restrictions.
DeepSeek is
run by its chief executive, Liang Wenfeng, a thin, bespectacled engineer who
studied at Zhejiang University in the eastern city of Hangzhou. He has said
repeatedly in the few interviews he has given to Chinese media that to catch up
with American innovation, Chinese companies must put research before profits.
DeepSeek and High-Flyer did not respond to requests for comment.
What Chinese
technology companies “lack in innovation is certainly not capital, but a lack
of confidence and knowledge about how to organize a high density of talent to
achieve effective innovation,” he said in a widely circulated interview with
Chinese tech outlet 36Kr.
Those who
have worked with Mr. Liang describe him as a capable manager with a deep
technical background, according to interviews and public accounts.
“He’s
definitely an INTP,” said Zihan Wang, a computer engineer who worked on an
earlier DeepSeek model, referring to an introspective personality type from the
Myers-Briggs test, a popular personality test among young people in China.
“INTPs are really good researchers and they have a willingness to explore,” Mr.
Wang said. “He is not one of those people who wants to control everything.”
Mr. Liang
was not too bothered with details like project timelines, and occasionally sent
thought-provoking research questions to the entire team of researchers, Mr.
Wang said. But mostly, Mr. Liang seemed driven to advance the technology and
was not focused on profits.
Unlike many
Chinese companies, which tend to focus on hiring programmers, Mr. Liang has
gained a reputation for employing people from outside of computing. Poets and
humanities majors from China’s top universities on DeepSeek’s staff train the
model to write classical Chinese poetry and ace questions taken from the
country’s difficult college entrance examination.
“Most of the
team graduated from the top universities in China,” said Yineng Zhang, a lead
software engineer at Baseten in San Francisco who works on SGLang, a
project not part of DeepSeek that helps people build on top of DeepSeek’s
system. “They are very smart and very young.”
For years,
Chinese tech companies pioneered artificial intelligence applications used in
computer vision, like facial recognition. But OpenAI’s release of ChatGPT
prompted a reckoning. When no Chinese company immediately released anything
comparable, many concluded that American companies had a lead in advanced A.I.
In China,
computer scientists were determined to prove they could compete. In 2023, many
companies in China released their own large language models, the technology
that underpins chatbots like ChatGPT.
But making
advanced models would require using a large number of chips that would cost
hundreds of millions of dollars.
High-Flyer
was spending, too. By 2021, it was one of just a handful of Chinese companies
that had been able to stockpile more than 10,000 advanced Nvidia A100 chips.
Yet
DeepSeek’s research gave it a surprising advantage. Last year, it dramatically
cut the prices it charged developers who build applications using its model,
prompting a price war with larger rivals.
Mr. Wang,
the engineer who previously worked at DeepSeek, said there was little
discussion of commercial applications for the technology they were building.
Instead, he said, the company was focused on making an A.I. system that could
be used by a range of people for many purposes.
“During my
time there, we did not talk much about how we make money,” Mr. Wang said. “They
just focused on making a great foundation model.”
A crucial
part of DeepSeek’s popularity is that it has made its developers’ work public.
This kind of information sharing, called open source, has been a cornerstone of
the development of computer software, the internet and now artificial
intelligence.
In the
United States, A.I. researchers and entrepreneurs have long followed the
progress of DeepSeek’s technology. Last year, the company turned heads when it
released systems designed to generate their own computer programs.
A new
challenge for the company may come with its new high profile. Last week, on the
same day it released R1, the model behind its new chatbot, Mr. Liang appeared at
a round table discussion with Li Qiang, China’s premier.
DeepSeek’s
sudden popularity has thrust it to the center of the Chinese Communist Party’s
efforts to spur innovation, and that could prove difficult to manage, said
Jimmy Goodrich, a senior adviser for technology analysis to the RAND
Corporation, a federally funded think tank. “It’s a big predicament for
DeepSeek. I’m sure they weren’t on the government’s five-year plan,” he said.
“Can they
maintain this chaotic carefree vision when both the party and the world is
watching?”
Zixu Wang
contributed research from Hong Kong.
Meaghan
Tobin covers business and tech stories in Asia with a focus on China and is
based in Taipei.
Paul Mozur
is the global technology correspondent for The Times, based in Taipei.
Previously he wrote about technology and politics in Asia from Hong Kong,
Shanghai and Seoul.
Alexandra
Stevenson is the Shanghai bureau chief for The Times, reporting on China’s
economy and society.