9 min read

Artificial intelligence has started doing something unexpected: playing Pokémon Red and Blue on live streams, with thousands of people watching every move. Instead of using cheat codes or rigid algorithms, these AI systems are figuring things out on the fly, like a beginner picking up the game.
What keeps people glued is the strange mix of nostalgia and novelty. We know the game inside out, but watching a machine struggle with a simple door or win a battle by accident makes it feel new again. It’s like watching a toddler learn to walk—awkward, funny, and weirdly inspiring.

Pokémon Red and Blue are among the most popular Game Boy games ever made. Now they’re making a comeback in the most unexpected way.
AI bots are trying to beat these classic games live, and fans are hooked on every move. Watching them play is like seeing the game through brand-new eyes.

Pokémon Red and Blue are simple by today’s gaming standards, but that simplicity hides layers of complexity that make them ideal AI challenges. The mechanics involve exploration, battle tactics, memory, and long-term decision-making.
Because the game doesn’t hold your hand, the AI has to work through real challenges: finding keys, backtracking, experimenting, and interpreting visual clues. It’s not just about “winning” the game; it’s about solving a thousand small problems along the way.

Claude and Gemini are large language models created by Anthropic and Google DeepMind, respectively. Usually, they’re used to answer questions, write code, or summarize text, but now they’ve been dropped into a Game Boy world to see what they can do.
What’s impressive is that these AIs aren’t given special advantages. They don’t “know” the rules ahead of time. They interpret on-screen information and generate button presses based on reasoning, aided by structured inputs and tools known as agent harnesses.
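To make the idea of an agent harness concrete, here is a minimal sketch of the observe-decide-act loop such a setup runs. Everything in it is a stand-in: `read_screen` and `decide` are hypothetical placeholders for a real screenshot pipeline and a real model API call.

```python
# Sketch of an agent harness: the model never touches game internals;
# it only sees an observation and returns a button press.
from dataclasses import dataclass, field


@dataclass
class Harness:
    """Feeds screen observations to a model and applies its chosen button."""

    memory: list = field(default_factory=list)  # running history the model can consult

    def read_screen(self) -> str:
        # Stand-in for a screenshot + text-recognition step.
        return "A wild PIDGEY appeared!"

    def decide(self, observation: str) -> str:
        # Stand-in for a model call: map what's on screen to a button.
        if "appeared" in observation:
            return "A"  # advance the battle dialog
        return "UP"     # otherwise keep exploring

    def step(self) -> str:
        obs = self.read_screen()
        action = self.decide(obs)
        self.memory.append((obs, action))  # harness keeps history for the model
        return action


harness = Harness()
print(harness.step())  # prints "A"
```

Real harnesses add more around this loop, such as summarizing long histories and exposing tools like maps or inventories, but the core cycle is the same.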

Beneath the colorful sprites and simple mechanics, Pokémon is a surprisingly deep environment for problem-solving. You manage resources, pick optimal battle strategies, navigate mazes, and make tough choices without much guidance.
For an AI, that’s gold. It means the system can’t rely on scripts or brute force. It has to observe, guess, test, and learn from feedback. That’s exactly what AI developers want: real-world complexity in a tightly scoped game.

Watching an AI struggle with a simple puzzle or fumble through a battle creates a strange sense of empathy. Viewers tune in expecting a cold machine and end up rooting for a digital underdog. There’s drama in the uncertainty, joy in small victories, and lots of unintentional comedy.
Twitch chat adds to the vibe, turning the experience into a shared journey. People cheer, groan, and laugh together as the AI gets stuck, then suddenly pulls off something brilliant. The AI doesn’t have a personality, but it develops one in the minds of its viewers. That’s what makes it work.

These AI models, like Claude and Gemini, aren’t loaded with game rules or walkthroughs. They take in images of the game screen, interpret what’s happening, and choose controller inputs like “press A” or “go left.”
Each success is earned. Catching a new Pokémon, escaping a cave, or buying the right item all require inference and planning. Watching the AI reach those moments is strangely satisfying. You’re not watching something programmed, you’re watching something learn.

Claude has earned three badges so far, though differences in their respective setups make direct performance comparisons with Gemini challenging. It often loops back on itself, forgets what it was doing, or tries the same failed path repeatedly. But then, out of nowhere, it’ll solve something brilliantly, and that unpredictability is what makes it engaging.
The charm is in the effort. Claude isn’t optimizing for speed; it’s stumbling toward understanding. Every small achievement feels big, and every weird decision makes you wonder what’s going on inside its digital brain.

Gemini, by contrast, is moving fast. It’s already at Victory Road and on track to challenge the Elite Four. It makes quick decisions, trains effectively, and solves puzzles with confidence. That doesn’t mean it’s perfect; it still makes odd calls, but it recovers quickly and seems to learn faster.
Watching Gemini feels like seeing someone who finally “gets” the game. Its playstyle is tighter, more goal-driven, and surprisingly efficient. It’s the AI equivalent of a gifted student in class, less funny, maybe, but undeniably sharp.

Even though Claude and Gemini are both LLMs, other AI systems interact with the game in different ways. Some setups provide screen captures with OCR for text recognition, while others, like OpenAI’s GPT-based models, might rely on button prompts or structured game feedback.
This means comparing AIs head-to-head isn’t always fair. One might have cleaner visual inputs or better data interpretation, while another is working with more noise. But that’s part of the experiment: seeing how different systems handle complexity.
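As a rough illustration of why input quality matters, here is a sketch of the kind of cleanup step a harness might apply to noisy OCR text before it reaches the model. The sample string and the specific fixes are invented for illustration; a real pipeline would tune these rules to its own OCR engine.

```python
# Sketch of normalizing noisy OCR output before sending it to a model.
import re


def clean_ocr_text(raw: str) -> str:
    """Repair common OCR noise: misread characters, stray symbols, whitespace."""
    text = raw.replace("0AK", "OAK")                  # digit/letter confusion (assumed example)
    text = re.sub(r"[^A-Za-z0-9!?'., ]", " ", text)   # drop stray symbols like | or :
    return re.sub(r"\s+", " ", text).strip()          # collapse runs of whitespace


noisy = "PROF.  0AK:  'Hello   there!|"
print(clean_ocr_text(noisy))  # prints "PROF. OAK 'Hello there!"
```

A model fed the cleaned string has a much easier job than one parsing the raw scan, which is one reason head-to-head comparisons across different harnesses can be misleading.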

Pokémon provides more than a nostalgic challenge; it offers a new benchmark for evaluating AI flexibility. Instead of grading models on trivia or math, we’re watching them explore, adapt, and build strategies across hours of gameplay.
This approach could reshape how we think about AI performance. Games like Pokémon require memory, long-term planning, and improvisation. Measuring progress here gives developers a clearer picture of general intelligence in action.

Unlike pre-programmed bots or human speedruns, these AI playthroughs are full of surprises. Sometimes the model gets lucky and catches a rare Pokémon right away. Other times, it completely ignores something obvious and spends an hour wandering.
That randomness keeps viewers hooked. No one knows what’s coming next, not even the AI. Each session feels like a new experiment, where anything can happen. It makes the old game world feel alive again, as if it’s being rediscovered by a mind seeing it for the first time.

Watching a machine learn the same game you grew up with sparks something emotional. Twitch chat lights up during clutch wins or ridiculous failures. Viewers argue over strategies and root for their favorite AI like it’s a sports team.
It turns into a community event. People bond over watching Claude struggle with Rock Tunnel or Gemini make a risky battle choice. The game becomes a shared reference point again, not just nostalgia but an unfolding drama.

These AIs aren’t brute-forcing their way through the game or accessing hidden data. They’re learning the hard way, by trial, observation, and inference. That’s what makes it different from earlier “AI plays game” demos. There’s no shortcut here.
When Claude finally figures out how to beat a gym, or Gemini makes a clever switch mid-battle, it feels earned. You know the model has to see the situation, recall what it’s learned, and apply it.

Sometimes the AI does something strange, like choosing an underpowered move or walking in circles for five minutes. But then it works. These models aren’t thinking like humans. They’re testing, recombining, and inventing strategies we might never try.
That’s the magic. You’re watching an alien intelligence stumble into success, which gives every moment an edge of discovery. The AIs don’t care about “the right way”; they care about what works. And sometimes, that leads to creative, even brilliant results that no human would have planned.

This experiment points to something big: games like Pokémon aren’t just for fun; they’re powerful tools for testing general intelligence. If AI can handle this world, it can start tackling more complex ones. Developers are already eyeing other classics: Zelda, Mario, maybe even Minecraft.
For now, the charm lies in watching Claude and Gemini wrestle with potions and Pidgeys. But the bigger story is about what comes next. These models won’t stop here. Today, it’s Pokémon. Tomorrow, it might be navigating virtual cities or even building games of their own.
Curious how far this tech can go? See how Gemini 2 is already transforming business analysis.

Watching an AI play Pokémon isn’t just entertainment; it’s a window into how we might one day work alongside intelligent systems. These models are learning to interpret environments, make decisions, and navigate goals in open-ended scenarios.
This kind of game-based training could lead to AIs that assist in simulations, design planning, or even real-time troubleshooting. If a model can understand a dynamic system like a video game and adapt to its rules, it’s not a stretch to imagine it helping people solve problems in science, education, and beyond.
Curious how AI is leveling up beyond games? Check out how Google Gemini is now creating AI podcasts; it’s wild what’s possible.
Which AI would you bet on, Claude’s slow-and-steady approach or Gemini’s speedy strategy? Drop your pick in the comments.
This slideshow was made with AI assistance and human editing.
Dan Mitchell has been in the computer industry for more than 25 years, getting started with computers at age 7 on an Apple II.
