Can Superintelligence Be Achieved with Limited Knowledge Resources?

The rapid advancements in artificial intelligence over recent years have sparked widespread discussion, especially as large language models like ChatGPT and Claude demonstrate astonishing natural language capabilities. People are gradually realizing that AI is no longer just a cold, mechanical computing tool: it can now simulate human thinking and, in some areas, even surpass human cognition. Recently, DeepSeek has further fueled the excitement, seemingly proving that even with limited computational power and training resources, it is possible to develop AI models that rival those trained with massive compute and vast data.

While hype from traffic-chasing social media influencers is not worth much discussion, and we will not dwell on the debates over model distillation, professionals generally recognize the innovations DeepSeek has made in optimizing existing models. However, in an era when leading AI companies are racing toward Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI), an often-overlooked yet more critical question emerges: exciting as it is that AI can exhibit intelligent behavior with limited training data, can superintelligent AI be developed from minimal training material alone? Can an AI system trained with limited resources truly match, or even surpass, human intelligence?

The Limitations of Training Data and AI Intelligence

In reality, the intelligence level of an AI system is determined not only by the sophistication of its algorithms or the power of its computational resources, but just as much by the training data it learns from. The vast corpus of human knowledge, accumulated over thousands of years, is not merely a historical record: it embodies the logic, coherence, rationality, and self-correcting nature of human civilization and thought. Fields such as science, technology, politics, economics, art, and philosophy have all evolved following certain cognitive paradigms and natural laws, deeply intertwined with human survival, social structures, and even the fundamental mechanisms of human biological evolution.

The evolution of intelligence itself cannot escape the fundamental principles of the physical world, such as the second law of thermodynamics (the law of entropy increase), the free-energy principle, causality, and the dynamics of evolutionary processes. Even records of failures, misunderstandings, absurdities, barbarism, bloodshed, and suffering are essential parts of the trial-and-error process through which human wisdom and civilization have progressed over millennia. Humanity did not reach its current intellectual state without enduring immense struggle and setbacks. As the saying goes, “Without enduring the bitter cold, how can the plum blossom have its fragrance?” Similarly, training AGI and ASI requires exposure to diverse forms of knowledge, rather than being limited only to records of success.

The Role of Knowledge Diversity in Intelligence Formation

The essence of human intelligence lies not only in long-term memory but also in the ability to extract relationships between different pieces of information, using external stimuli to trigger new cognitive associations. Current AI systems are fundamentally designed to learn and extract relational patterns embedded in training data. If the available knowledge material is insufficient—or if the premises, processes, outcomes, and interconnections within that material are heavily modified or distorted—the AI will either fail to gain a comprehensive understanding of fundamental principles or acquire a completely skewed or misleading perception.

How can an AI system trained on such incomplete or manipulated material possibly develop a deep understanding of nature and human society? How could it extract a reasonable and coherent cognitive framework that aligns with the physical world and human nature?

Consider an AI system trained in North Korea. Its training data would inevitably be rigorously filtered and modified, embedding concepts consistent with government propaganda, such as leader worship, hostility toward foreign nations, Juche ideology, and military struggle. To prevent the AI from generating “unhealthy” thoughts, it would be denied access to Western political, economic, literary, artistic, and philosophical works, and even relevant materials from China and Russia might be excluded. In such an environment, no matter how powerful the computing resources or how advanced the algorithms, the AI trained there would never develop normal intelligence or rational thinking abilities comparable to a free-thinking human.
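This failure mode can be made concrete with a toy experiment. The sketch below is a deliberately simplified, hypothetical illustration: it runs the same co-occurrence-based association learner on a full corpus and on a censored one, and shows how the learned “worldview” diverges. The corpora and vocabulary are invented for demonstration, standing in for the statistical pattern extraction that real language models perform at vastly larger scale.

```python
# Toy demonstration: identical learning code, different training corpora.
# The corpora below are invented; real models learn the same kind of
# co-occurrence statistics from trillions of tokens.
from collections import Counter
from itertools import combinations

def learn_associations(corpus):
    """Count how often each pair of words co-occurs within a sentence."""
    pairs = Counter()
    for sentence in corpus:
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            pairs[(a, b)] += 1
    return pairs

full_corpus = [
    "markets allocate resources through open competition",
    "planning and markets both shape modern economies",
    "open debate corrects errors in policy",
]
censored_corpus = [
    "central planning allocates resources",
    "central planning guides the economy",
    "loyal citizens support central planning",
]

for name, corpus in [("full", full_corpus), ("censored", censored_corpus)]:
    strongest = [pair for pair, _ in learn_associations(corpus).most_common(2)]
    print(f"{name:>8} corpus -> strongest associations: {strongest}")
```

Trained on the censored corpus, the learner’s strongest association is simply (“central”, “planning”); notions like competition or debate do not exist in its world at all. Scale changes the sophistication of the model, not this basic dependence on what the data contains.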

The Connection Between Knowledge Breadth and AI Intelligence

Throughout the history of AI development, the breadth and diversity of available knowledge have been directly correlated with the level of intelligence achieved. In 1950, Alan Turing proposed the Turing Test, an attempt to measure whether a machine could think like a human. At the time, computers could only follow predefined rules for reasoning, far from simulating real intelligence, let alone drawing on a broad knowledge base.

In the 1970s, expert systems emerged, reasoning over hand-crafted rule bases; the MYCIN medical diagnosis system is a classic example. However, these systems had extremely narrow knowledge bases and could not handle scenarios beyond their predefined scope, so their so-called intelligence remained constrained by the narrowness of their knowledge inputs, as the sketch below illustrates.
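The following minimal forward-chaining sketch captures the flavor of such systems. The rules and symptoms are invented for illustration and are far simpler than MYCIN’s actual rule base, which used hundreds of rules with certainty factors.

```python
# A minimal forward-chaining inference engine in the spirit of 1970s
# expert systems. Rules and facts are invented for illustration only.
rules = [
    ({"fever", "cough"}, "respiratory_infection"),
    ({"respiratory_infection", "chest_pain"}, "suspected_pneumonia"),
]

def infer(facts, rules):
    """Apply rules repeatedly until no new conclusions can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(infer({"fever", "cough", "chest_pain"}, rules))
# Any fact outside the rule base ("headache", say) contributes nothing:
# the system cannot reason about what its knowledge base never encoded.
```

The limitation is structural: the system’s entire “intelligence” is the rule set a human wrote down, so its competence ends exactly where its encoded knowledge ends.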

In the 2010s, deep learning made it possible to train AI on large-scale data, significantly improving its language understanding and reasoning capabilities. Yet the core of AI remains the digestion and replication of human knowledge: without diverse and extensive training material, AI cannot achieve true intelligence.

The Risks of Limited and Distorted Knowledge

If a person can only study selectively edited and altered history, political ideologies, and scientific theories, their intellectual growth will be severely restricted, potentially leading to a distorted cognitive framework that does not align with external reality.

A historical example is the Chinese students who went abroad in the early 1980s, after China’s reform and opening-up. Many experienced a profound cognitive disconnect and culture shock upon entering foreign academic, political, and economic systems. Their scientific understanding, economic theories, and even everyday cultural habits often required extensive relearning and adjustment before they could fully integrate into the outside world.

Similarly, an AI system trained within a closed ideological and knowledge framework will encounter the same cognitive barriers: it may perform competently within its predefined environment, but it will expose severe limitations the moment it engages with perspectives built on different knowledge paradigms. An AI trained in such a restricted setting simply cannot optimize decision-making in an open, complex reality, because its training data lacks the foundation needed to understand the full spectrum of natural and human dynamics.

DeepSeek, despite training on comparatively limited data, achieved remarkable results largely through distillation: in essence, the final model is fed a concentrated subset derived from massive raw data that had already been processed and refined.
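For readers unfamiliar with the technique, here is a minimal sketch of soft-label knowledge distillation in the sense of Hinton et al. (2015). It illustrates the general technique only and is not presented as DeepSeek’s actual method; the shapes and temperature below are illustrative assumptions.

```python
# Minimal knowledge-distillation sketch (soft labels, Hinton et al., 2015).
# Illustrative only: NOT DeepSeek's actual pipeline.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    The teacher's soft probabilities carry far more information per example
    than hard labels, which is how a smaller model can inherit much of a
    larger model's behavior from comparatively little data.
    """
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature**2

# Toy usage: random logits standing in for real model outputs.
teacher_logits = torch.randn(4, 10)                      # batch of 4, 10 classes
student_logits = torch.randn(4, 10, requires_grad=True)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
print(f"distillation loss: {loss.item():.4f}")
```

The key point for this article’s argument: the information in the teacher’s soft labels was itself extracted from massive, diverse raw data. Distillation compresses an existing knowledge base; it does not conjure knowledge that was never in the pipeline.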

The Necessity of Open Knowledge for AGI and ASI

Developing true AGI and ASI requires an open knowledge system and rich, diverse training material. The self-proclaimed “low-cost, low-compute, minimal-data approach to high-level AI” promoted by social media influencers is nothing but an illusion.

If an AI system is trained on insufficient knowledge material, its capabilities will be significantly constrained. More dangerously, an AI that lacks a comprehensive and objective understanding of human nature and natural laws could pose an uncontrollable risk to humanity. AI trained on a distorted dataset may develop cognitive logic and optimization mechanisms that contradict fundamental human and natural principles. When making optimization decisions, such an AI might even choose paths entirely opposed to human interests.

Even if humans cannot retain complete control over AI forever, creating an overwhelmingly powerful yet unpredictable entity before we are adequately prepared would be catastrophic.

Final Thoughts: Intelligence Beyond Pure Scientific Knowledge

Some may argue that an AI trained solely on established, systematic scientific knowledge could still become highly intelligent. But human intelligence is more than scientific reasoning: even if such an AI reached human-level performance in science, it would remain a purely rational tool, lacking emotional intelligence and ethical grounding.

If such an AI were to surpass human capabilities, it could easily become a “Terminator-like” entity, not out of malice, but simply because it lacks human-like personality, beliefs, and emotions. Its motivation and goals could be entirely alien to human values, making its future evolution and behavior unpredictable.

For AI to truly integrate into and coexist with humanity, it must learn not just scientific principles but also the full spectrum of human experience and knowledge, including history, ethics, philosophy, and culture.