New Delhi (Tech Desk): Chinese AI startup DeepSeek, which recently launched its open-source DeepSeek-V3 model, is back with another powerful large language model built for reasoning, called DeepSeek-R1. The new open-source model matches OpenAI’s frontier model o1 in tasks such as math, coding, and general knowledge, and is 90-95 percent cheaper to use than o1. DeepSeek-R1 is designed as an AI that not only answers questions but also reasons through problems the way humans do. Its predecessor, DeepSeek-V3, outperformed models from Meta and OpenAI while costing significantly less than those models.
What is DeepSeek-R1?
DeepSeek’s new AI model is a state-of-the-art reasoning model designed to enhance the problem-solving and analytical capabilities of AI systems. According to the research paper, the new model comes in two main versions: DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero is trained entirely via reinforcement learning (RL), without any supervised fine-tuning. DeepSeek-R1 builds on the foundation laid by R1-Zero, adding a cold-start phase with carefully curated data and multi-stage RL to improve both its reasoning capabilities and the readability of its output.
Model Performance
DeepSeek-R1 has delivered remarkable results on benchmarks. On the mathematics benchmark AIME 2024, the model scored 79.8 percent (pass@1), which is on par with OpenAI’s o1. On another mathematics benchmark, MATH-500, DeepSeek-R1 achieved 97.3 percent accuracy, surpassing most competing models. On Codeforces, a competitive programming platform used as a coding benchmark, the model ranked in the 96.3rd percentile of human participants, demonstrating expert-level coding capabilities. On general knowledge benchmarks such as MMLU and GPQA Diamond, DeepSeek-R1 achieved 90.8 percent and 71.5 percent accuracy, respectively. In AlpacaEval 2.0, a benchmark that tests an AI model's writing and question answering, DeepSeek-R1 achieved an 87.6 percent win rate.
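For readers unfamiliar with the pass@1 figure quoted for AIME 2024, here is a minimal sketch of how such a score can be computed. It assumes that several answers are sampled per problem and that pass@1 is the share of correct answers, averaged over all problems. The function name `pass_at_1` and the toy data are hypothetical, made up for illustration, and are not taken from DeepSeek's evaluation code.

```python
from statistics import mean


def pass_at_1(results):
    """Average per-problem fraction of correct sampled answers.

    `results` is a list with one entry per problem; each entry is a list
    of booleans, one per sampled answer (True means the answer was correct).
    This layout is an assumption made for this sketch.
    """
    if not results:
        raise ValueError("need at least one problem")
    per_problem = []
    for samples in results:
        if not samples:
            raise ValueError("each problem needs at least one sampled answer")
        # sum() counts the True values, so this is the share of correct answers
        per_problem.append(sum(samples) / len(samples))
    return mean(per_problem)


# Hypothetical toy data: 3 problems, 4 sampled answers each
toy_results = [
    [True, True, True, True],      # always solved
    [True, False, True, False],    # solved half the time
    [False, False, False, False],  # never solved
]

score = pass_at_1(toy_results)
print(f"pass@1 = {score:.1%}")
```

On a real benchmark the same calculation would run over every problem in the test set, so a reported 79.8 percent means that a single sampled answer is correct about four times out of five on average.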
Use Cases
Since DeepSeek-R1 is capable of solving complex logic and mathematical problems, it can be an excellent tool for advanced education or tutoring systems. Given its strong results on coding benchmarks, it can also be employed in software development, especially for code generation and debugging tasks. With its long-context understanding and strong question-answering capabilities, the model could also prove extremely valuable in research.