๐—” ๐—ป๐—ฒ๐˜‚๐—ฟ๐—ฎ๐—น ๐—ป๐—ฒ๐˜๐˜„๐—ผ๐—ฟ๐—ธ ๐˜€๐—ถ๐—บ๐˜‚๐—น๐—ฎ๐˜๐—ฒ๐˜€ ๐——๐—ข๐—ข๏ฟฝ... | ๐—” ๐—ป๐—ฒ๐˜‚๐—ฟ๐—ฎ๐—น ๐—ป๐—ฒ๐˜๐˜„๐—ผ๐—ฟ๐—ธ ๐˜€๐—ถ๐—บ๐˜‚๐—น๐—ฎ๐˜๐—ฒ๐˜€ ๐——๐—ข๐—ข๏ฟฝ...
๐—” ๐—ป๐—ฒ๐˜‚๐—ฟ๐—ฎ๐—น ๐—ป๐—ฒ๐˜๐˜„๐—ผ๐—ฟ๐—ธ ๐˜€๐—ถ๐—บ๐˜‚๐—น๐—ฎ๐˜๐—ฒ๐˜€ ๐——๐—ข๐—ข๐— : ๐—š๐—ผ๐—ผ๐—ด๐—น๐—ฒ ๐—ฟ๐—ฒ๐˜€๐—ฒ๐—ฎ๐—ฟ๐—ฐ๐—ต๐—ฒ๐—ฟ๐˜€ ๐—ผ๐—ฝ๐—ฒ๐—ป ๐˜๐—ต๐—ฒ ๐˜„๐—ฎ๐˜† ๐—ณ๐—ผ๐—ฟ ๐—ฐ๐—ผ๐—บ๐—ฝ๐—น๐—ฒ๐˜๐—ฒ๐—น๐˜†-๐—”๐—œ-๐—ด๐—ฒ๐—ป๐—ฒ๐—ฟ๐—ฎ๐˜๐—ฒ๐—ฑ ๐—ด๐—ฎ๐—บ๐—ฒ๐˜€!

Imagine if games were completely live-generated by an AI model : the NPCs and their dialogues, the storyline, and even the game environment. The playerโ€™s in-game actions would have a real, lasting impact on the game story.

In a very exciting paper, Google researchers just gave us the first credible glimpse of this future.

โžก๏ธ They created GameNGen, the first neural model that can simulate a complex 3D game in real-time. They use it to simulate the classic game DOOM running at over 20 frames per second on a single TPU, with image quality comparable to lossy JPEG compression. And it feels just like the true game!

Here's how they did it:
1. They trained an RL agent to play DOOM and recorded its gameplay sessions.
2. They then used these recordings to train a diffusion model to predict the next frame, based on past frames and player actions.
3. During inference, they use only 4 denoising steps (instead of the usual dozens) to generate each frame quickly.

๐—ž๐—ฒ๐˜† ๐—ถ๐—ป๐˜€๐—ถ๐—ด๐—ต๐˜๐˜€:
๐ŸŽฎ๐Ÿค” Human players can barely tell the difference between short clips (3 seconds) of the real game or the simulation
๐Ÿง  The model maintains game state (health, ammo, etc.) over long periods despite having only 3 seconds of effective context length
๐Ÿ”„ They use "noise augmentation" during training to prevent quality degradation in long play sessions
๐Ÿš€ The game runs on one TPU at 20 FPS with 4 denoising steps, or 50 FPS with model distillation (with some quality loss)

The researchers did not open source the code, but I feel like weโ€™ve just seen a part of the future being written!

Their paper (exploding the upvote counter) ๐Ÿ‘‰
Diffusion Models Are Real-Time Game Engines (2408.14837)

In a similar vein, play @Jofthomas's 'Everchanging Quest' ๐ŸŽฎ
Jofthomas/Everchanging-Quest

00:07