On January 20, the Chinese AI startup DeepSeek released its reasoning-oriented open-source model DeepSeek-R1. Over the weekend of January 25-26, the neural network attracted community attention, leading to sell-offs in stock and cryptocurrency markets.

DeepSeek: The New AI Powerhouse - What is it?

DeepSeek is an artificial intelligence startup founded in 2023 in Hangzhou, China. The company specializes in developing large open-source language models and has gained recognition for its innovative approach and achievements.

In November, DeepSeek introduced its reasoning-focused AI model DeepSeek-R1-Lite-Preview. According to published tests, it performs on par with OpenAI’s o1-preview. At the end of December, the firm showcased its LLM DeepSeek-V3, which surpassed competitors from Meta and OpenAI in tests. DeepSeek’s open-source models compete with leading AI technologies, offering advanced reasoning and strong benchmark performance.

DeepSeek-V3 has 671 billion parameters; by comparison, Llama 3.1 has 405 billion. Parameter count loosely reflects a model’s capacity to handle more complex applications and produce more accurate responses.

Development of the neural network took two months, cost $5.58 million, and required significantly fewer computational resources than the efforts of larger tech companies. Nvidia H800 chips were used, and the training process was optimized to extract the most from the available computing power.

Thanks to the new DeepSeek-R1 model, the company’s chatbot skyrocketed in the rankings of free apps on the U.S. App Store, surpassing even ChatGPT.

Introduction to DeepSeek

DeepSeek is a Chinese AI startup that has been making waves in the global AI community with its cutting-edge open-source models and low inference costs. Founded in 2023 by Liang Wenfeng, former head of the High-Flyer quantitative hedge fund, DeepSeek has quickly risen to the top of the AI market with its innovative approach to research and development. With a focus on open-source innovation, longer context windows, and dramatically lower usage costs, DeepSeek has positioned itself as a viable alternative to more expensive proprietary platforms.

DeepSeek R1: the killer of OpenAI’s o1

DeepSeek offered performance comparable to top models at a much lower cost. In several tests conducted by third-party developers, the Chinese model outperformed Llama 3.1, GPT-4o, and Claude 3.5 Sonnet. Experts tested the AI for response accuracy, problem-solving, mathematics, and programming.

DeepSeek faces significant challenges in a competitive landscape dominated by technology giants like OpenAI, Google, and Meta. These challenges could affect its growth and adoption, particularly in terms of resource allocation and how well its approach holds up against proprietary models.

“The developers have indeed managed to create an open-source neural network that performs computations efficiently in output mode. We must take China’s developments very seriously,” commented Microsoft CEO Satya Nadella at the World Economic Forum in Davos, Switzerland.

DeepSeek also surprised observers by managing to work around U.S. export control restrictions. “The Chinese company DeepSeek may pose the greatest threat to American stock markets, since it appears to have built a revolutionary AI model at an extremely low cost and without access to advanced chips, calling into question the utility of the hundreds of billions in investments pouring into this sector,” commented journalist Holger Zschäpitz.
DeepSeek introduced “distilled” versions of R1 ranging from 1.5 billion to 70 billion parameters. The smallest can run on a laptop, and in one demonstration DeepSeek R1 was even launched on a smartphone. The largest version requires powerful hardware but is available via API at a price 90-95% lower than OpenAI’s o1: $0.14 per million tokens versus $7.50 for its American competitor (a hedged call sketch appears at the end of this section). To achieve high performance at lower cost, the Chinese developers “rethought everything from scratch,” creating innovative and cost-effective AI tools.

Paradigm Shift in the Global AI Landscape

Morgan Brown, Vice President of Product at Dropbox, explained DeepSeek’s approach and technical solutions: “Traditional AI is like writing every number with 32 decimal places. At DeepSeek they thought: ‘What if we only use 8? That would still be accurate enough!’ Boom—75% less memory.” (A minimal sketch of the memory arithmetic appears at the end of this section.)

DeepSeek also implemented a “multi-token” system. Standard AI “reads like a first-grader”: “The cat… sat…”. The Chinese neural network reads entire phrases at once, roughly twice as fast while retaining about 90% of the accuracy.

DeepSeek’s use of Multi-Head Latent Attention (MLA) further improves efficiency: rather than caching full keys and values for every attention head, the model compresses them into a compact latent representation, sharply cutting the memory needed to serve long contexts (see the toy sketch below).

“But here’s what is really smart: they created an ‘expert system.’ Instead of one massive AI trying to know everything (like if one person were a doctor, lawyer, and engineer), they have specialized experts that activate only when necessary,” noted Brown. In traditional models, all 1.8 trillion parameters are active all the time. DeepSeek has 671 billion parameters, but only 37 billion are active at once (the routing sketch below shows the idea). “It’s like having a huge team but only bringing in those specialists who are truly needed for each task,” added Dropbox’s VP of Product.

The results, experts noted, are “mind-blowing”:
- Training cost: $100 million → $5 million
- Required GPUs: 100,000 → 2,000
- API costs: 95% cheaper
- Can run on gaming GPUs

DeepSeek’s large language models bypass traditional supervised fine-tuning in favor of reinforcement learning, allowing them to develop advanced reasoning and problem-solving capabilities independently.

“But wait,” you might say, “there must be some catch!” That’s just it: everything is open source. Anyone can verify the work, the code is publicly available, and everything is explained in the technical reports. “This isn’t magic; it’s just incredibly smart engineering,” concluded Brown. DeepSeek achieved these results with a team of fewer than 200 people.

However, R1 has a downside: censorship. Being a Chinese model, it is subject to government control, and its responses will not touch on Tiananmen Square or Taiwan’s autonomy. “The impressive performance of DeepSeek’s distilled models means that highly capable reasoning systems will continue to be widely disseminated and run on local equipment away from any oversight,” noted AI researcher Dean Ball of George Mason University.

DeepSeek’s ambition to develop artificial general intelligence (AGI) as part of its long-term vision highlights its commitment to advancing AI capabilities beyond current limitations.

AI Models and Innovations

DeepSeek has developed a range of AI models praised for their reasoning abilities, problem-solving skills, and cost-effectiveness. The company’s flagship model, DeepSeek R1, is a large language model trained using a reinforcement learning (RL) approach, allowing it to learn independently and develop self-verification, reflection, and chain-of-thought (CoT) capabilities. DeepSeek R1 has been released in six smaller versions that are small enough to run locally on laptops, with one of them outperforming OpenAI’s o1-mini on certain benchmarks (a local-run sketch follows below).

DeepSeek’s models are designed to be highly efficient, with a focus on maximizing software-driven resource optimization and embracing open-source methods. This approach not only mitigates resource constraints but also accelerates the development of cutting-edge technologies. The models are also highly scalable, with performance improving as reasoning steps grow longer.
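Brown’s “32 decimal places” line refers to numeric precision rather than decimals, but the memory arithmetic is easy to verify. Below is a minimal NumPy sketch using simple int8 quantization; DeepSeek’s training reportedly uses a more elaborate FP8 mixed-precision scheme, so treat this as an illustration of the 4x saving, not their actual method.

```python
import numpy as np

# Hypothetical weight matrix stored in full 32-bit precision.
weights_fp32 = np.random.randn(4096, 4096).astype(np.float32)

# Simple symmetric int8 quantization: map the float range onto 256 levels.
scale = np.abs(weights_fp32).max() / 127.0
weights_int8 = np.round(weights_fp32 / scale).astype(np.int8)

# Dequantize on the fly when the weights are needed for a matmul.
weights_restored = weights_int8.astype(np.float32) * scale

print(f"fp32 size: {weights_fp32.nbytes / 2**20:.1f} MiB")  # 64.0 MiB
print(f"int8 size: {weights_int8.nbytes / 2**20:.1f} MiB")  # 16.0 MiB, i.e. 75% smaller
print(f"max error: {np.abs(weights_fp32 - weights_restored).max():.5f}")
```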
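To make the MLA idea concrete, here is a toy sketch of the latent key-value cache trick: only a small latent vector per token is stored, and keys and values are re-expanded from it when attention runs. All dimensions and weights here are invented for illustration; they are not DeepSeek’s actual configuration.

```python
import numpy as np

d_model, d_latent, n_heads, d_head = 1024, 64, 8, 128

rng = np.random.default_rng(0)
W_down = rng.standard_normal((d_model, d_latent)) * 0.02          # shared compression
W_up_k = rng.standard_normal((d_latent, n_heads * d_head)) * 0.02  # key expansion
W_up_v = rng.standard_normal((d_latent, n_heads * d_head)) * 0.02  # value expansion

x = rng.standard_normal((10, d_model))  # hidden states for 10 cached positions

# Cache only the small latent vector per token...
latent_cache = x @ W_down               # shape (10, 64)

# ...and re-expand keys/values from it on demand.
k = latent_cache @ W_up_k               # shape (10, 1024)
v = latent_cache @ W_up_v

full_kv = 2 * n_heads * d_head          # 2048 floats/token in a standard KV cache
print(f"cache per token: {d_latent} floats instead of {full_kv}")
```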
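The “expert system” Brown describes is a mixture-of-experts (MoE) layer: a small router scores the experts for each token, and only the top-k actually run. A self-contained sketch with toy sizes, not DeepSeek’s real 671B/37B configuration:

```python
import numpy as np

n_experts, top_k, d = 16, 2, 512
rng = np.random.default_rng(1)

router = rng.standard_normal((d, n_experts)) * 0.02
experts = [rng.standard_normal((d, d)) * 0.02 for _ in range(n_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a token to its top-k experts; the other experts stay idle."""
    logits = x @ router                            # one score per expert
    top = np.argsort(logits)[-top_k:]              # indices of the chosen experts
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over winners
    # Only top_k of the n_experts weight matrices are ever multiplied.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

token = rng.standard_normal(d)
out = moe_layer(token)
print(f"active experts per token: {top_k} of {n_experts}")
```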
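Because the distilled weights are published openly, running the smallest one locally takes only a few lines with Hugging Face transformers. The model ID below follows the naming used on DeepSeek’s Hugging Face organization page; verify it (and your hardware budget) before relying on this sketch.

```python
# pip install transformers torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID assumed from DeepSeek's Hugging Face releases; check before use.
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

inputs = tokenizer("Solve step by step: what is 17 * 23?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```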
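For the hosted model behind the cheap API mentioned above, DeepSeek exposes an OpenAI-compatible endpoint, so the standard openai client works with a swapped base URL. The endpoint and model name below follow DeepSeek’s public documentation at the time of writing and should be treated as assumptions that may change.

```python
# pip install openai
from openai import OpenAI

# Base URL and model name assumed from DeepSeek's docs; both may change.
client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",  # placeholder: issued by DeepSeek's platform
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # the R1 reasoning model per the docs
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
)

print(response.choices[0].message.content)
```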
Business Model and Partnerships

DeepSeek’s business model is unusual in that it is financed entirely by High-Flyer, a successful quantitative hedge fund. This arrangement allows DeepSeek to operate without the pressure of shareholder demands or aggressive Series A milestones, and its remarkably low API pricing has positioned it as a viable alternative to more expensive proprietary platforms.

DeepSeek has also partnered with other companies and organizations to advance its AI research and development. For example, the company has collaborated with Hugging Face on the Open R1 initiative, an ambitious project aiming to replicate the full DeepSeek R1 training pipeline. If successful, this initiative could enable researchers around the world to adapt and refine R1-like models, further accelerating innovation in the AI space.

Future Outlook and Challenges

DeepSeek’s rapid rise comes with challenges that could shape its future, including U.S. export controls and market perception. The company must consistently prove its reliability, especially for enterprise-grade deployments, and navigate the fast-evolving AI landscape.

Despite these challenges, DeepSeek’s outlook is promising. Its commitment to open-source innovation and its focus on highly efficient, scalable models have positioned it as a leader in the global AI landscape, well placed to capitalize on emerging trends and opportunities.

DeepSeek also faces challenges tied to the geopolitical implications of its Chinese origins. It must navigate a complex web of export controls and regulatory frameworks while addressing concerns about potential biases in its training data. Overall, its success will depend on balancing innovation with accountability while navigating the geopolitics of the AI industry.

Sell-off in the AI Market

The sharp rise in DeepSeek’s popularity led to sell-offs in stocks and cryptocurrencies, as investors grew concerned about a bubble in the artificial intelligence sector. American AI startups spend billions training neural networks while their valuations reach hundreds of billions of dollars; DeepSeek demonstrated that this may not be necessary.

On January 27, shares of Japanese companies involved in chip production fell sharply. The American stock market also declined significantly, particularly shares of Nvidia, the main beneficiary of the AI boom. Sell-offs in TradFi spilled over into cryptocurrencies, especially tokens tied to artificial intelligence.
AI-agent tokens were hit particularly hard, as crypto investors seemed to be “digesting” DeepSeek’s influence on the future of the AI sector within digital assets. Meanwhile, the open-source coding model, exemplified by DeepSeek Coder and DeepSeek-R1, has democratized access to advanced AI capabilities, fostering collaboration and customization; it is particularly appealing to independent developers and startups looking for alternatives to expensive proprietary systems. Notably, on January 27 Bitcoin fell below $100,000, with leading altcoins posting even deeper declines.