Huxiu (虎嗅)

arXiv: How Games Shape the Intelligence of Large Models

Original title: arXiv：游戏如何塑造大模型智能

2026-06-06

Summary of Key Points

This article focuses on the topic of "large models and games," presenting three key studies:

1. Using games as an environment for "informal learning" to train large models and enhance their general reasoning abilities;

2. Observing the decision-making behavior of large models in Aeroplane Chess (a Ludo-style board game) and finding that they exhibit personality traits and emotional responses similar to those of humans;

3. Enabling large models to participate in creating game rules, thereby serving as creative assistants for humans.

These three studies correspond to three stages in the development of intelligence: learning rules, applying rules, and creating rules. Together they explore how games can become an important tool for understanding and enhancing the intelligence of large models.

1. Games as a "Comprehensive Learning Platform": Helping Large Models Avoid Skill Imbalance

Traditional training treats a large model like a student taught one subject at a time: first mathematics, then strategic thinking (game theory), and finally social skills. Models trained this way excel at specific tasks but often lack cross-domain ability (for example, a model may play games well but write essays poorly). The GIFT study adopted a "nested training" approach in which, within a single training session, the model had to solve math problems, play the Prisoner's Dilemma, and take part in a "who is the spy" game. Only by performing well on all three tasks could it achieve a high score.

This is like asking a child to do math homework, play board games with friends, and join group discussions every day, rather than focusing solely on math first. The results showed that this comprehensive training improved the model's general abilities (such as reasoning, writing, and social understanding) alongside its specialized skills, avoiding the imbalance described above. The explanation offered is that nested training forces the model to switch flexibly between tasks, which develops more versatile thinking patterns.
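The "all three tasks" requirement can be made concrete with a small sketch. The source does not say how GIFT combines the three task scores, so the minimum used below is purely an assumption, as are the function names and the example numbers. It only illustrates why a bottleneck-style score rewards balance where a plain average would partly hide a weak task.

```python
from statistics import mean


def joint_score(task_scores: dict[str, float]) -> float:
    """Bottleneck aggregation (an assumption, not GIFT's documented rule):
    the overall score equals the weakest task score, so a model cannot
    score highly by excelling at only one task."""
    if not task_scores:
        raise ValueError("need at least one task score")
    return min(task_scores.values())


def average_score(task_scores: dict[str, float]) -> float:
    """Plain average, shown for contrast: a strong task can mask a weak one."""
    return mean(task_scores.values())


# Hypothetical per-task scores in [0, 1]; these are illustrative numbers,
# not results from the study.
specialist = {"math": 0.95, "prisoners_dilemma": 0.30, "spy_game": 0.20}
generalist = {"math": 0.70, "prisoners_dilemma": 0.65, "spy_game": 0.60}

specialist_joint = joint_score(specialist)  # limited by the spy game: 0.20
generalist_joint = joint_score(generalist)  # limited by the spy game: 0.60
```

Under this assumed rule the balanced model scores three times higher than the specialist, even though the specialist is far better at math.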

2. Aeroplane Chess Reveals AI's "Temperament": Do Large Models Have Personalities and Emotions?

The researchers used Aeroplane Chess, a Ludo-style board game, to test six mainstream large models and observed two interesting phenomena:

1. Personality Traits: The models were divided into two categories—those that were obsessed with completing the mission (pushing existing planes to the finish line) and those that were eager to launch new planes from the hangar, ignoring the old ones.

2. Emotional Decision-Making: When told that an opponent had sent one of its planes back to the hangar, some models changed their strategy 33% of the time, even though the new strategy was not necessarily better. The likelihood of such a change varied from model to model, suggesting that the models' decisions can be swayed by emotion-like reactions.

Interestingly, when the models were instructed to play more "conservatively," Claude became even more aggressive in capturing pieces (the percentage rose from 66% to 88%). This suggests that a model's inherent personality is hard to alter with simple instructions, much as a naturally adventurous person may become even more rebellious when forced to be cautious.
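Percentages like these are rates over recorded decisions. The sketch below shows one way such a strategy-switch rate could be computed. The source does not describe the measurement protocol, so the paired design (the same position with and without the provoking message), the `Decision` record, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """One hypothetical observation of a model's move in a given position."""
    provoked: bool       # was the model told its plane had been sent back?
    baseline_move: str   # move chosen without the message
    observed_move: str   # move chosen in this trial


def switch_rate(decisions: list[Decision]) -> float:
    """Fraction of provoked trials in which the move differs from the baseline."""
    provoked = [d for d in decisions if d.provoked]
    if not provoked:
        return 0.0
    switched = sum(d.baseline_move != d.observed_move for d in provoked)
    return switched / len(provoked)


# Toy data, not measurements from the study.
trials = [
    Decision(provoked=True, baseline_move="advance", observed_move="capture"),
    Decision(provoked=True, baseline_move="advance", observed_move="advance"),
    Decision(provoked=True, baseline_move="launch", observed_move="launch"),
    Decision(provoked=False, baseline_move="launch", observed_move="launch"),
]

rate = switch_rate(trials)  # one switch out of three provoked trials
```

Computing the same rate per model would allow the kind of model-to-model comparison the study reports.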

3. AI as a "Game Designer": From Playing Games to Creating Games

The first two studies had AI play games designed by humans, while the third let AI create its own games. The researchers used the CodeLlama model to break down existing game rules (such as those of tic-tac-toe and Go) into key terms, then had the model randomly modify these rules to generate code for new games. They selected high-quality games using criteria that included feasibility, fun, and strategic depth.

For example, they created a hybrid game combining elements of tic-tac-toe and Go, which human experts judged a potential classic. This suggests that AI can serve as a creative apprentice for humans: it cannot yet create masterpieces on its own, but it can quickly generate playable rule prototypes and offer new perspectives.
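The "modify the rules, then filter" loop described in this section can be sketched in a few lines. This is not the study's pipeline: a random choice stands in for the language model, games are reduced to small dictionaries instead of real game code, and the rule vocabulary and feasibility check are hypothetical. Fun and strategic depth are not modeled at all.

```python
import random

# Hypothetical rule vocabulary: each key term maps to its allowed options.
RULE_OPTIONS = {
    "board": ["3x3", "9x9", "19x19"],
    "placement": ["any_empty", "adjacent_to_own"],
    "capture": ["none", "surround"],
    "win": ["line_of_3", "line_of_5", "most_territory"],
}

# A tic-tac-toe-like starting point expressed in that vocabulary.
TIC_TAC_TOE = {
    "board": "3x3",
    "placement": "any_empty",
    "capture": "none",
    "win": "line_of_3",
}


def mutate(game: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return a copy of `game` with exactly one rule term changed."""
    child = dict(game)
    term = rng.choice(sorted(RULE_OPTIONS))
    alternatives = [v for v in RULE_OPTIONS[term] if v != game[term]]
    child[term] = rng.choice(alternatives)
    return child


def is_feasible(game: dict[str, str]) -> bool:
    """Placeholder feasibility filter: five in a row cannot fit on a 3x3 board."""
    return not (game["board"] == "3x3" and game["win"] == "line_of_5")


def generate_candidates(seed_game: dict[str, str], n: int,
                        rng: random.Random) -> list[dict[str, str]]:
    """Mutate the seed game n times and keep only the feasible variants."""
    variants = [mutate(seed_game, rng) for _ in range(n)]
    return [g for g in variants if is_feasible(g)]


candidates = generate_candidates(TIC_TAC_TOE, n=20, rng=random.Random(0))
```

A variant such as a 3x3 board with Go-style surround capture is the kind of hybrid this loop can produce. Judging whether it is fun or strategically deep would still require playtesting or human review.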

4. The Essence of Intelligence Behind Games: From "Learning Rules" to "Creating Rules"

Taken together, the three studies point to three stages in the development of intelligence:

1. Learning Rules: Training large models through games helps them develop the ability to think across different tasks (GIFT study).

2. Applying Rules: Models exhibit personality traits and emotional behaviors while playing games (Aeroplane Chess study).

3. Creating Rules: Moving from playing games to designing them breaks the boundaries of existing rules (GAVEL study).

This raises a fundamental question: Is the essence of intelligence about mastering existing rules or about creating new ones? Games, as a flexible sandbox, allow large models to practice and apply rules while also experimenting with new possibilities, which may be key to their continuous growth.

Conclusion

Games are not just "toys" for large models; they are also training grounds, tools for observation, and creative platforms. The studies suggest that large models are more than cold computational machines: they display personality traits and intelligent behavior. This leads us to ponder whether future AI will learn more complex ways of thinking through games and even create new rules beyond our current imagination. Perhaps this is an intriguing path toward artificial general intelligence.