
This week, Chinese AI lab DeepSeek gained widespread attention when its chatbot app topped the charts on the Apple App Store (and Google Play, too). DeepSeek’s AI models, which were trained using compute-efficient methods, have Wall Street analysts and engineers wondering whether the United States can keep its lead in the AI race and whether demand for AI chips will hold up.
In its most basic form, an AI chatbot receives input, analyzes it, and produces a relevant response. When a user asks a question, the chatbot considers the user’s intent along with other elements such as tone and sentiment before attempting to provide the most appropriate response.
DeepSeek is now widely used. But where did it originate, and how did it become so well known around the world so fast?
DeepSeek’s origins lie in trading. It is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to guide its trading decisions.
In 2015, Liang Wenfeng, an AI enthusiast, co-founded High-Flyer. According to reports, Wenfeng started experimenting with trading while attending Zhejiang University. In 2019, he established High-Flyer Capital Management, a hedge fund dedicated to creating and implementing AI algorithms.
High-Flyer established DeepSeek in 2023 as a lab devoted to researching AI tools, separate from its financial business. The lab was then spun off into its own company, also called DeepSeek, with High-Flyer as one of its investors.
From the start, DeepSeek built its own data center clusters for model training. However, like other AI firms in China, DeepSeek has been affected by U.S. export restrictions on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100 chip that is available to U.S. companies.
DeepSeek’s technical team is said to be mostly young. The company reportedly recruits PhD-level AI researchers aggressively from top Chinese universities. According to The New York Times, DeepSeek also hires people without a computer science background to help its technology better understand a wide range of subjects.
DeepSeek released its first set of models, DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat, in November 2023. However, the AI industry didn’t start paying attention until the spring of 2024, when the firm unveiled its next-generation DeepSeek-V2 family of models.
DeepSeek-V2, a general-purpose text and image analysis system, did well on a number of AI benchmarks and was far cheaper to run than comparable models at the time. It forced ByteDance and Alibaba, two of DeepSeek’s domestic rivals, to cut the usage prices for some of their models and make others completely free.
The December 2024 release of DeepSeek-V3 only increased DeepSeek’s reputation.
According to DeepSeek’s internal benchmark testing, DeepSeek-V3 outperforms both downloadable, openly available models such as Meta’s Llama and “closed” models that can only be accessed through an API, such as OpenAI’s GPT-4o.
DeepSeek’s R1 “reasoning” model is equally impressive. Released in January, R1 performs as well as OpenAI’s o1 model on key benchmarks, according to DeepSeek.
Because it is a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that commonly trip up models. Reasoning models take somewhat longer to arrive at answers than a standard non-reasoning model, typically seconds to minutes longer. The upside is that they tend to be more reliable in fields like math, science, and physics.
However, R1, DeepSeek-V3, and DeepSeek’s other models have drawbacks. Because the AI was developed in China, it is subject to benchmarking by China’s internet regulator to ensure that its responses “embody core socialist values.” In DeepSeek’s chatbot app, for instance, R1 won’t answer questions about Tiananmen Square or Taiwan’s autonomy.
It’s unclear exactly what DeepSeek’s business model is, if it has one. The company offers some of its products and services for free and prices others well below market value. Despite considerable VC interest, it is not taking investor money.
DeepSeek says efficiency breakthroughs have allowed it to remain exceptionally cost-competitive. However, some experts dispute the figures the company has provided.
In any event, developers have embraced DeepSeek’s models, which are available under permissive licenses that allow commercial use but are not open source in the traditional sense of the term. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers on Hugging Face have created over 500 “derivative” models of R1 that have racked up a combined 2.5 million downloads.
DeepSeek’s success against bigger and more established competitors has been described as both “upending AI” and “over-hyped.” The company’s success was at least partly responsible for the 18% drop in Nvidia’s stock price in January and for prompting a public response from OpenAI CEO Sam Altman. According to Reuters, U.S. Commerce Department bureaus told employees in March that DeepSeek would be banned on their government devices.
Microsoft announced that DeepSeek is available through its Azure AI Foundry service, a platform that brings together AI services for enterprises under one roof. During Meta’s first-quarter earnings call, CEO Mark Zuckerberg, asked about DeepSeek’s effect on the company’s AI spending, said that investing in AI infrastructure will remain a “strategic advantage” for Meta. In March, OpenAI called DeepSeek “state-subsidized” and “state-controlled” and recommended that the U.S. government consider banning DeepSeek models.
On Nvidia’s fourth-quarter earnings call, CEO Jensen Huang highlighted DeepSeek’s “excellent innovation,” saying that it and other “reasoning” models are good for Nvidia because they need so much more compute.
Meanwhile, some companies, as well as entire countries and governments, including South Korea, are banning DeepSeek. New York State has also banned DeepSeek from government devices.
It’s unclear what the future holds for DeepSeek. Better models are inevitable. However, the U.S. administration appears to be growing more wary of what it considers harmful foreign influence. The Wall Street Journal reported in March that DeepSeek would likely be banned from U.S. government devices.