
AI Community Bombshell: A "$50" Recreation of DeepSeek R1?


Recently, researchers from Fei-Fei Li's team at Stanford University, working with institutions including the University of Washington, released a new model, s1, which shows strong performance on mathematical and programming benchmarks. The news has sent shockwaves through the AI community, and many are eager to try replicating the result. But is the replication really as simple as the rumors suggest?


The "$50 for 26 minutes" recreation of DeepSeek R1 is not as simple as it sounds!


The "Foundation" Role of the Qwen Model


The "$50 for 26 minutes" figure refers only to the resources and time spent on supervised fine-tuning (SFT) of an open-source foundation model; it excludes the initial data preparation, the training of the foundation model itself, and the time needed to deploy the various related components. While the fine-tuning step is fast, the research as a whole still depends on curating the SFT training data and on pre-training the foundation model, both of which typically take weeks to months.

The s1 model did not emerge out of thin air; it stands on the shoulders of two top industry models. One is Google's recently launched Gemini Flash Thinking, which was used to generate the reasoning chains and answers for 1,000 questions that make up the training dataset. The other is Alibaba's recently released Qwen2.5-32B-Instruct, which serves as the base model for SFT. It is through the combined support of these two models that s1 was able to emerge.


According to the s1 research paper, s1 was fine-tuned from Alibaba's Qwen model and then trained on 1,000 curated samples; the Qwen model plays a crucial foundational role in the process. Unlike "native" deep reasoning models, s1 did not undergo reinforcement learning (RL) training. Instead, it uses 1,000 question-answer pairs annotated with reasoning chains to make the pre-trained base model (Qwen) better at reasoning.
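To make the data recipe concrete, here is a minimal sketch of how one s1-style SFT training example might be assembled from a question, a reasoning chain, and a final answer. The `<|...|>` delimiter tokens and the function name are illustrative assumptions, not the actual special tokens used by the s1 authors or by the Qwen chat template.

```python
# Illustrative sketch: assembling one of the 1,000 SFT training examples that
# pair a question with a Gemini-generated reasoning chain and final answer.
# The <|...|> delimiters are hypothetical placeholders, not the exact special
# tokens used by the s1 authors or by the Qwen chat template.

def build_sft_example(question: str, reasoning_chain: str, answer: str) -> str:
    """Concatenate question, chain of thought, and answer into one training text."""
    return (
        f"<|user|>{question}<|assistant|>"
        f"<|think|>{reasoning_chain}<|/think|>{answer}"
    )

sample = build_sft_example(
    "What is 12 * 13?",
    "12 * 13 = 12 * 10 + 12 * 3 = 120 + 36 = 156.",
    "156",
)
print(sample)
```

Fine-tuning the base model on texts of this shape is what teaches it to emit a reasoning chain before its final answer.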


The original Stanford s1 paper explicitly states that the model was fine-tuned from Alibaba's Qwen model.


Several AI researchers outside China have likewise noted that many "new" models are in fact built on top of the Qwen model.


Low-Cost Training of Large Models Offers a New Direction for the AI Field:


Trained via supervised fine-tuning on Alibaba Cloud's Qwen model, s1 does not start from scratch. Instead, it combines techniques such as budget forcing, training on an extremely small sample, efficient SFT, and test-time scaling to build a low-cost model with strong reasoning ability. This low-cost training is possible because of an already powerful open-source foundation model; it demonstrates the potential of this style of AI training and showcases the advantages of open source. The research approach undoubtedly offers the AI field a new direction for thinking.
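As a rough illustration of what budget forcing means at inference time, the sketch below simulates the mechanism described in the s1 paper: when the model tries to end its reasoning too early, the end-of-thinking token is suppressed and a continuation cue such as "Wait" is appended instead, while an upper budget caps the chain. The stub model, the token strings, and the budget values are all hypothetical stand-ins, not the paper's actual implementation.

```python
# Sketch of s1-style "budget forcing" at decode time: if the model emits the
# end-of-thinking token before a minimum thinking budget is reached, that
# token is suppressed and a continuation cue ("Wait") is appended instead,
# pushing the model to keep reasoning; a maximum budget caps the chain.
# `fake_model`, the token strings, and the budgets are illustrative stand-ins.

END_OF_THINKING = "</think>"

def generate_with_budget(model, prompt, min_thinking_tokens=8, max_thinking_tokens=32):
    """Decode one token at a time, enforcing lower/upper bounds on chain length."""
    tokens = []
    while len(tokens) < max_thinking_tokens:
        next_token = model(prompt, tokens)
        if next_token == END_OF_THINKING:
            if len(tokens) < min_thinking_tokens:
                tokens.append("Wait")  # suppress the stop token, force more thought
                continue
            break  # budget satisfied: allow the model to stop thinking
        tokens.append(next_token)
    return tokens

# Toy stand-in decoder that tries to stop after every three reasoning steps.
def fake_model(prompt, tokens):
    return END_OF_THINKING if len(tokens) % 4 == 3 else "step"

out = generate_with_budget(fake_model, "What is 2 + 2?")
print(out)
```

In the real system the same idea operates on the logits of the end-of-thinking delimiter during generation; this toy loop only shows the control flow.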


The release of s1 fully showcases the rapid development of global large-model technology and industry, and it is strong evidence that the development of large models today should take the form of healthy competition in a high-level open environment. Around the world, top enterprises, research teams, and talent continue to compete at the technological frontier, learning from one another through competition and driving innovation through collaboration. This open, cooperative ecosystem is what accelerates technical breakthroughs and lays a solid foundation for a prosperous global AI industry.
