
We were staring at a folder containing 24 files. They had names like a78-09.12, a78-05-1.52, and 68705.bin. The largest was 64 kilobytes. The smallest, 115 bytes. No headers, no metadata, no README. Just raw binary, dumped from the circuit board of a 1986 Taito arcade cabinet.
These were the ROM files for Bubble Bobble, one of the most beloved arcade games ever made. Specifically, this was the bub68705 variant, a set with its own fascinating history. And we asked Claude, Anthropic's AI, to tell us everything these bytes contained.
What came back was not a summary. It was a complete reverse-engineering of every system, every sprite, every level, every secret hidden in the game. The kind of analysis that once required specialized tools, deep expertise, and months of painstaking work.
It took an afternoon.

Bubble Bobble's hardware is surprisingly complex for a 1980s arcade game. The bub68705 ROM set reveals a multi-processor architecture: four Z80 CPUs handling game logic, slave processing, sound control, and additional audio, plus an MC68705P5 microcontroller dedicated entirely to copy protection.
That microcontroller is the historically interesting part. Taito's original protection scheme was considered uncrackable. The MCU handled a cryptographic handshake at boot, and without the correct response sequence, the game refused to run. The bub68705 variant exists because someone, decades ago, successfully reverse-engineered that handshake. It required physically decapping the chip, tracing its internal logic, and reimplementing the protection routine on a different microcontroller. It was a landmark achievement in the arcade preservation community.
We fed all 24 binary files to Claude. No special tools, no plugins, no decompiler integration. Just the raw bytes and a request: analyze everything.
The analysis came in layers, each one deeper than the last.
System architecture. Claude mapped the complete memory layout of all four Z80 processors, identifying shared RAM regions, I/O ports, bank-switching mechanisms, and inter-processor communication protocols. It found the reset vectors, interrupt handlers, and main loop entry points for each CPU.
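As a rough illustration of the first step in that kind of mapping, the sketch below checks the Z80's fixed entry addresses (reset at 0x0000, the mode 1 interrupt at 0x0038, NMI at 0x0066) for an absolute jump. Those addresses are properties of the Z80 itself. The function name and the assumptions that the ROM is mapped at address 0 and that each handler begins with a `JP nn` instruction are ours, not details reported by the analysis.

```python
# Fixed Z80 entry addresses (architectural facts, not specific to this game).
Z80_VECTORS = {"reset": 0x0000, "im1_irq": 0x0038, "nmi": 0x0066}

JP_NN = 0xC3  # opcode for the absolute jump "JP nn"


def jump_targets(rom):
    """Return the JP target found at each vector, or None if no JP is there.

    ASSUMPTION: the ROM image is mapped at CPU address 0x0000.
    """
    targets = {}
    for name, addr in Z80_VECTORS.items():
        if addr + 2 < len(rom) and rom[addr] == JP_NN:
            # The Z80 stores the 16-bit jump operand little-endian.
            targets[name] = rom[addr + 1] | (rom[addr + 2] << 8)
        else:
            targets[name] = None
    return targets
```

A vector that reports None is not necessarily unused: real ROMs often start with instructions such as DI or a stack pointer load before jumping, which a proper disassembler would follow.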
The protection system. The MC68705 microcontroller's bootstrap sequence, port assignments, and handshake protocol were fully documented. Claude traced the exact byte-level exchange between the main CPU and the protection MCU, explaining how the game validates authenticity at startup.
Every sprite in the game. This one stopped us cold. The graphics data was encoded in a planar 4-bit-per-pixel format with a twist: every byte was XOR'd with 0xFF (the inversion that MAME's ROM definitions flag as ROMREGION_INVERT). Claude identified the encoding, reversed the inversion, decoded the planar format, and extracted 12,288 tiles. From those tiles, it assembled 2,701 individual 16x16 sprites: Bub and Bob in every animation frame, all enemy types, every item and power-up, the complete font set, background elements, UI components. Everything.
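The two decoding steps named here, undoing the inversion and merging bitplanes, can be sketched in a few lines. This is a minimal illustration only. The plane order, the one-byte-per-row layout, and the way the four planes are located inside the ROM files are assumptions made for the example, and the function names are hypothetical.

```python
def invert(data):
    """Undo the stored inversion by XORing every byte with 0xFF."""
    return bytes(b ^ 0xFF for b in data)


def decode_tile(planes):
    """Decode one 8x8 tile from four 8-byte bitplanes into 4-bit pixel values.

    ASSUMPTION: planes[0] holds bit 0 of every pixel and planes[3] holds
    bit 3, with one byte per row and the leftmost pixel in the most
    significant bit. The real hardware layout may differ.
    """
    tile = []
    for row in range(8):
        pixels = []
        for col in range(7, -1, -1):
            value = 0
            for bit, plane in enumerate(planes):
                value |= ((plane[row] >> col) & 1) << bit
            pixels.append(value)
        tile.append(pixels)
    return tile
```

Each decoded value is a palette index from 0 to 15. Under these assumptions, a 16x16 sprite would be four such tiles arranged in a 2x2 block.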
The audio system. Two FM synthesis chips, a Yamaha YM2203 and a YM3526, driven by a dedicated sound CPU. Claude mapped the sound command interface, identifying how the main game triggers music and effects through a shared memory location.
Complete gameplay mechanics. Bubble physics, enemy behavior patterns, item spawn conditions, scoring rules, the conditions for the game's two different endings. All extracted from the binary logic.
Easter eggs and secrets. The hidden power-up codes. The developer messages embedded in the ROM. The conditions for reaching the true ending (both players must be present). The secret items that appear only under obscure circumstances.
But the most impressive feat was what came next.

Bubble Bobble has 100 unique levels, each a single-screen playfield of platforms, walls, and open space where bubbles drift according to invisible wind currents. These level designs are the soul of the game, hand-crafted puzzles that escalate in complexity and creativity. Extracting them from the binary turned out to be the hardest problem in the entire analysis.
The first attempts went down the wrong path entirely. Claude initially searched the main CPU ROM for level data, trying to interpret Z80 machine code as tile maps. The patterns didn't match. The data wasn't there.
Then came the breakthrough: the level data was split across two different ROMs.
The 32x24 platform grid for each level was packed into the slave CPU ROM (4.bin) starting at offset 0x0CFC, encoded at just 3 bits per cell. Each cell could be one of five values: 0 for a solid block, 1 for upward flow (where bubbles rise), 2 for rightward wind, 3 for leftward wind, and 4 for downward flow. At those figures, an entire level's geometry packs into 288 bytes (768 cells at 3 bits each).
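To make the packing concrete, here is a small sketch of unpacking one such grid. It takes the stated grid size and bit width at face value, so one level spans 32 x 24 x 3 / 8 = 288 bytes. The bit order (most significant bit first) and the absence of padding between rows are assumptions, and `unpack_cells` is a hypothetical helper, not code from the analysis.

```python
GRID_W, GRID_H = 32, 24      # grid size as stated in the article
BITS_PER_CELL = 3
LEVEL_BYTES = GRID_W * GRID_H * BITS_PER_CELL // 8   # 288 bytes per level


def unpack_cells(data, offset=0):
    """Unpack one level's 3-bit cells into GRID_H rows of GRID_W integers.

    ASSUMPTION: cells are packed MSB-first with no padding between rows.
    """
    chunk = data[offset:offset + LEVEL_BYTES]
    if len(chunk) < LEVEL_BYTES:
        raise ValueError("not enough data for one level")
    bits = int.from_bytes(chunk, "big")
    total = GRID_W * GRID_H
    cells = [
        (bits >> (BITS_PER_CELL * (total - 1 - i))) & 0b111
        for i in range(total)
    ]
    return [cells[r * GRID_W:(r + 1) * GRID_W] for r in range(GRID_H)]
```

With the slave ROM loaded as bytes, `unpack_cells(rom, 0x0CFC)` would give the first level under these assumptions. Printing solid cells as `#` and everything else as a space is a quick way to check whether the layout looks like a playfield.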
The level metadata (monster placements, palette selections, scroll speeds, timers, and difficulty parameters) lived in a completely different ROM (a78-05-1.52) at offset 0x673A, stored as 43-byte bit-packed structures. Each structure defined not just what a level looked like, but how it behaved: which enemies spawned where, how fast they moved, what color scheme to use.
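A sketch of how such records might be sliced out and read field by field follows. The record size, count, and offset come from the text above. The field layout inside each record is not given, so the reader below is a generic MSB-first bit reader and any field widths passed to it are purely hypothetical.

```python
RECORD_SIZE = 43       # bytes per level record, per the article
NUM_LEVELS = 100
META_OFFSET = 0x673A   # offset within a78-05-1.52, per the article


def level_records(rom, offset=META_OFFSET):
    """Slice the ROM into NUM_LEVELS consecutive RECORD_SIZE-byte records."""
    end = offset + RECORD_SIZE * NUM_LEVELS
    if end > len(rom):
        raise ValueError("ROM too short for 100 records at this offset")
    return [rom[i:i + RECORD_SIZE] for i in range(offset, end, RECORD_SIZE)]


class BitReader:
    """Read unsigned fields from a bytes object, most significant bit first.

    ASSUMPTION: MSB-first ordering; the real records may pack bits differently.
    """

    def __init__(self, data):
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read(self, width):
        if width > self.remaining:
            raise ValueError("read past end of record")
        self.remaining -= width
        return (self.value >> self.remaining) & ((1 << width) - 1)
```

Working out the real layout would mean trying candidate widths, for instance `BitReader(record).read(5)`, and checking whether the resulting values look plausible as coordinates or enemy types across all 100 records.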
Claude found both halves, figured out how they linked together, decoded the bit-packing schemes, and rendered all 100 levels as images with correct palette colors and monster positions.

The results are stunning. Round 1's classic three-platform layout, the design that introduced millions of players to the game, is immediately recognizable.

Round 3 shows the nested box design that taught players about bubble-riding mechanics.
By Round 50, the designs have become elaborate, asymmetric constructions that demand mastery of every game system.

And here is where it gets delightful. The level designers at Taito used their 32x24 grids as a canvas. Many levels spell out words: "BUBBLE", "POPCORN", "JUMP!", "SOS!!", "BONUS", "OUCH!!", "HI-TECH!", "DRUNK!!", "RUN AWAY!!", "SUPER GAMER!!". Others form pictures: hearts, skulls, Pac-Man ghosts, Space Invaders aliens, butterflies, snowflakes. Each one a tiny piece of pixel art hidden inside a puzzle game, invisible to anyone who only played and never looked at the raw data.
Forty years of secrets, encoded in 3 bits per cell, waiting for someone, or something, to read them.
Let's step back and consider what actually happened here.
A conversational AI was given raw binary files with no context beyond their filenames. Through iterative analysis and reasoning, it produced a complete reverse-engineering of a commercial software product: architecture documentation, full disassembly, decoded graphics, extracted level data, gameplay mechanics, audio system mapping, and hidden content discovery.
No IDA Pro. No Ghidra. No specialized reverse-engineering tools. No hex editor, even. Just an AI reading bytes and thinking about what they meant.
The process that took the original bub68705 bootleggers months or years of work, involving hardware probing, logic analyzer traces, oscilloscope readings, and physical chip decapping, was replicated in an afternoon conversation. Not perfectly replicated. Not identically. But the end result, a fully understood and documented system, was the same.
And here is the part that should make every software professional pause: this is not limited to retro arcade games.
Every compiled binary. Every firmware blob. Every proprietary protocol. Every embedded system. They are all made of the same thing: bytes that encode logic and data according to knowable conventions. If an AI can decode a 40-year-old arcade game from raw ROM dumps, it can read your shipping software just as easily. Probably more easily, since modern binaries contain far more structural hints than a bare Z80 ROM.
Preservation gets a massive upgrade. Thousands of games, applications, and entire operating systems exist only as compiled binaries. Their source code is lost, their developers unreachable, their documentation nonexistent. AI can now decode these artifacts, document their behavior, extract their assets, and make them understandable again. Digital history just became recoverable at scale.
Security through obscurity is officially dead. It was always a bad idea. Now it is not even a viable bad idea. Every binary ships with its secrets visible to an AI that can read machine code the way you read English. Hardcoded encryption keys, proprietary algorithms, undocumented backdoors, hidden authentication bypasses: none of these survive contact with an AI that can reason about binary structure. If your security model depends on nobody understanding your compiled code, your security model is already broken.
Legacy system migration becomes tractable. There are banking mainframes running COBOL binaries that nobody alive fully understands. Industrial controllers executing firmware written by engineers who retired decades ago. Medical devices running code that has never been documented. These systems are black boxes that organizations are terrified to touch. AI-driven binary analysis offers a path to understanding them, documenting them, and eventually migrating them to modern platforms. Problems that seemed permanently intractable are now merely difficult.
Intellectual property faces hard questions. If any compiled binary can be fully understood and documented by an AI in an afternoon, what does "trade secret" mean for software? The legal framework around software IP was built on the assumption that compilation creates a meaningful barrier to understanding. That assumption needs revisiting. The business implications, for licensing, for competitive intelligence, for patent enforcement, are significant and largely unexplored.
The era of binary opacity is over. For decades, we have treated compiled software as a black box. You put source code in, you get an executable out, and the executable is functionally opaque to anyone without extraordinary skill and tools. That assumption is now invalid. Every compiled binary is, effectively, readable. Not open source in the legal sense, but open to understanding in the practical sense. The distinction between "source available" and "binary only" is becoming a legal distinction rather than a technical one.
Even small details surfaced. The analysis recovered, for example, the logic that decides which type of food award appears.

Twenty-four files in a folder. Cryptic names. No documentation. Just raw bytes dumped from a circuit board that was manufactured when the Chernobyl disaster was still fresh news.
Those bytes held an entire world. A hundred hand-crafted levels, each one a small work of art. A cast of characters, from the heroic dragons Bub and Bob to the villainous Super Drunk. A two-player cooperative story about friendship, perseverance, and breaking curses. Easter eggs planted by developers who assumed no one would ever find them. An elaborate copy-protection scheme that represented someone's life's work to design, and someone else's life's work to defeat.
All of it was sitting there in the binary. Patient. Complete. Waiting to be read.
Now we have something that can read it.
The question is not whether AI will change how we think about compiled software. It already has. The question is what we do with a world where no binary is a black box anymore, where every piece of compiled code is a conversation waiting to happen.
Those 24 files from 1986 were just the beginning.
Source: https://kotrotsos.medium.com/we-pointed-an-ai-at-raw-binary-files-from-1986-662ba30120f3