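The English Version of "Lu ShouQun's View of DeepSeek" Has Been Published on Hugging Face
Attached: "Lu ShouQun's View of DeepSeek" (English version), as published on Hugging Face: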
Lu ShouQun, Honorary Chairman of China OSS (Open Source Software) Promotion Union (COPU)
Abstract
The China Open Source Software Promotion Union (COPU) has long tracked the evolution of global artificial intelligence (AI) technology, with a particular focus on collaborative innovation between the open source ecosystem and AI development. In response to the current dilemma of "high capital consumption and low technology accessibility" in the global AI competition, COPU believes it is urgent to explore sustainable paths of innovation. The DeepSeek AI model developed by Liang Wenfeng's team has achieved high performance while reducing training costs, through algorithmic innovation (such as a dynamic sparse training architecture) and an innovative open source business model. This confirms China's ability to make breakthroughs in AI methodology and technology, and it is the core point of view of this article.
DeepSeek’s greatest success is that Liang Wenfeng’s team, with an innovative attitude, has opened a new path for developing AI: “low investment, low cost, limited resources, high efficiency, and high cost performance (output)”.
DeepSeek can be regarded as a representative work of China's current AI, and it is changing the pattern of AI development worldwide. It has lowered the bar for the public and for enterprises around the world to use AI, and it has opened up a smooth road for emerging forces to develop AI. It negates the old path of developing AI with "huge investment, high cost, massive resources, low efficiency, and low cost performance (output)".
It is not an exaggeration to call Liang Wenfeng’s team a group of wizards or geniuses who have achieved “national destiny” innovation!
Liang Wenfeng's team insists on open source innovation. Open source supports the iterative innovation, stability, and upgrading of AI, as well as the development of the ecosystem. DeepSeek combines full open source of the large model on the consumer side (C-end) with an open source business model on the business side (B-end). It not only practices open source innovation but also supports the development of the open source industry. This is another major creation of DeepSeek.
Some people try to play down DeepSeek by rating it according to current rankings of output products. In fact, the output performance of DeepSeek and that of other large generative language models are roughly on par, and none is dramatically ahead of the others. If we compare them more scientifically, on the basis of cost-effectiveness, DeepSeek is definitely the best in the world.
Currently, there are not many secrets left in DeepSeek’s key technology. The teams behind some large generative language models in China and abroad have basically learned it. When it comes to the next stage of AI competition, it can be said that everyone is on the same starting line.
The advent of DeepSeek has triggered a fierce competition in global AI.
The current DeepSeek model, like other large language models, is a generative autoregressive large language model. Limitations and defects exist in DeepSeek and affect its performance. In DeepSeek’s development, it is important to overcome these limitations, root out the defects, greatly improve intelligence, save energy and increase efficiency, and expand applications.
For a generative autoregressive language model, language cannot replace the real world, so the model lacks world knowledge and cannot generate new knowledge to truly understand the physical world. In addition, language is not the same as thinking, and relying on language limits the depth of thinking during operation, which ultimately limits the level of intelligence produced. The autoregressive mechanism of the language model training architecture is based on tokens and on the signal processing and statistics that support them, which is the root cause of hallucination.
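The autoregressive mechanism described above can be illustrated with a deliberately tiny sketch. It is not DeepSeek's architecture or training method: the two-sentence corpus, the bigram statistics, and the function names train_bigram, generate, and possible_sentences are all assumptions made up for illustration. The sketch only shows how a model that picks each token from statistics over the previous token can emit fluent sentences that never appeared in its data.

```python
import random
from collections import Counter, defaultdict

# Toy corpus, invented for this illustration only.
CORPUS = [
    "paris is the capital of france",
    "rome is the capital of italy",
]

def train_bigram(sentences):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def generate(counts, rng, max_len=20):
    """Autoregressive sampling: draw each token given only the previous one."""
    token, out = "<s>", []
    for _ in range(max_len):
        choices, weights = zip(*counts[token].items())
        token = rng.choices(choices, weights=weights, k=1)[0]
        if token == "</s>":
            break
        out.append(token)
    return " ".join(out)

def possible_sentences(counts, max_len=20):
    """List every sentence the model can emit with nonzero probability."""
    results = []

    def walk(token, prefix):
        if token == "</s>":
            results.append(" ".join(prefix))
            return
        if len(prefix) >= max_len:
            return
        for nxt in counts[token]:
            walk(nxt, prefix if nxt == "</s>" else prefix + [nxt])

    walk("<s>", [])
    return results

if __name__ == "__main__":
    model = train_bigram(CORPUS)
    for sentence in possible_sentences(model):
        label = "seen in corpus" if sentence in CORPUS else "never seen"
        print(f"{sentence}  [{label}]")
    print("sample:", generate(model, random.Random()))
```

Because the toy model only tracks which token tends to follow which, it treats "paris is the capital of italy" as just as plausible as the sentences it was given. Real large language models condition on far longer contexts, but the sketch is meant to convey the same statistical character of token-by-token generation.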
DeepSeek, like other standard and inclusive base models, is difficult to transform directly into high-quality productivity for enterprises and industries. It still needs to build the commercial value it currently lacks. Such models lack a deep understanding of enterprises and industries. When they are actually applied in business scenarios (such as finance, manufacturing, and medical care) to generate value for enterprises and industries, they must capture enterprise and industry data and apply it to fill the gaps.
It is suggested that an important task in DeepSeek’s development is to resolve its problems of deviation and transition, so that it can prevail in the fierce global competition.
The goal of calibrating DeepSeek is to develop real and advanced AI, namely Artificial General Intelligence (AGI). In developing AGI, we must not be impatient for quick success. To achieve AGI, we must first complete the tasks of the transition stage of AI (such as multimodality, embodiment, agents, and world models). AGI is AI with an autonomous system, and it stands at the crossroads of whether AI can surpass human intelligence. This bears on human safety and even on the extremely serious question of whether humans can survive on Earth. Ensuring preventive measures for human safety while developing AGI also requires countries around the world to take unified action on the basis of mutual trust and to implement a policy that combines technology with management (regulation). The task is extremely demanding and arduous.
The original post is available at the following link:
https://huggingface.co/blog/COPU2004/lu-shouquns-view-of-deepseek
Cheng Haixu (IBM) and Cathy.J (COPU): Everyone is welcome to read, share, comment on, and discuss this article.