An Introduction to Learning Through Simulation
Chief Executive Officer
Chief Technology Officer at Yoyo
Payment facilitator for small businesses
Video Game Development Engineer at Playful, Ensemble, Newtoy (Zynga with Friends)
AOE, Lucky's Tale, Creativerse, Oculus SDK (v0 through v1)
Minor OSS contributor
DevIL, ResIL, fog, Emscripten (WebAssembly)
An internal simulator that learns to predict what happens next
| Concept | Learns From | Output |
|---|---|---|
| Reinforcement Learning | Experience | Policy/Value |
| Generative AI | Data | Image, video, text |
| World Model | Environment interactions | Internal simulation |
Captures state, transitions, and visual output
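A minimal interface sketch of those three pieces; the class and method names here are illustrative assumptions, not an API from the talk or any particular library:

```python
from typing import Protocol
import numpy as np

class WorldModel(Protocol):
    """Illustrative interface: a world model captures state, transitions,
    and visual output. All names are assumptions for this sketch."""

    def encode(self, observation: np.ndarray) -> np.ndarray:
        """Compress a raw observation into an internal state."""
        ...

    def transition(self, state: np.ndarray, action: np.ndarray) -> np.ndarray:
        """Predict what happens next: the internal state after taking `action`."""
        ...

    def render(self, state: np.ndarray) -> np.ndarray:
        """Decode an internal state back into a visual output (e.g. an image)."""
        ...
```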
Q-Learning (Watkins & Dayan, 1992) — Machine Learning, 8(3–4):279–292. Proves convergence of the Q-learning algorithm to the optimal action values under appropriate conditions.
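As a quick illustration of the algorithm the paper analyses, one tabular Q-learning step can be sketched as follows (the table layout, state/action indexing, and hyperparameters are assumptions for the example):

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q[s, a] toward the bootstrapped
    target r + gamma * max_a' Q[s_next, a']."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```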
Playing Atari with Deep Reinforcement Learning (Mnih et al., 2013) — Introduces the DQN algorithm (a deep neural-network variant of Q-learning) that can learn control policies directly from raw pixels.
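A sketch of the kind of pixels-to-Q-values network DQN uses; the layer sizes below follow the 2013 paper's description for stacks of 84×84 frames, but treat this as an illustration rather than a faithful reimplementation:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a stack of 84x84 grayscale frames to one Q-value per action."""
    def __init__(self, num_actions: int, in_frames: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_frames, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),  # 9x9 spatial size after the two convs
            nn.Linear(256, num_actions),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(frames))
```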
Deep Reinforcement Learning with Double Q‑Learning (van Hasselt et al., 2016) — A deep RL variant addressing over-estimation in Q-learning via Double Q-learning.
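The difference is easiest to see in the target computation: the online network selects the next action while the target network evaluates it. A sketch assuming PyTorch-style Q-networks and batched tensors:

```python
import torch

def double_dqn_target(online_net, target_net, rewards, next_obs, dones, gamma=0.99):
    """Double Q-learning target: decoupling action selection from action
    evaluation reduces the over-estimation of the plain max-based DQN target."""
    with torch.no_grad():
        next_actions = online_net(next_obs).argmax(dim=1, keepdim=True)   # select with online net
        next_q = target_net(next_obs).gather(1, next_actions).squeeze(1)  # evaluate with target net
        return rewards + gamma * (1.0 - dones) * next_q
```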
Dreaming is a safer, faster path to learning
Why Simulate?
| Real World | In Simulation |
|---|---|
| Slow (robot training) | Fast iterations |
| Risky (self-driving) | Completely safe |
| Expensive (physical trials) | Cost-effective |
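A sketch of what "dreaming" looks like in code: the policy is rolled out inside the learned model instead of the real environment, which is where the speed, safety, and cost advantages come from. The `world_model`, `policy`, and `reward_fn` interfaces are assumptions carried over from the sketch above:

```python
def dream_rollout(world_model, policy, reward_fn, start_state, horizon=15):
    """Generate an imagined trajectory entirely inside the learned model:
    no robot wear, no road risk, no per-trial cost."""
    trajectory, state = [], start_state
    for _ in range(horizon):
        action = policy(state)                              # act in imagination
        next_state = world_model.transition(state, action)  # model predicts what happens next
        trajectory.append((state, action, reward_fn(state, action), next_state))
        state = next_state
    return trajectory
```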
| Year | Milestone |
|---|---|
| 1990s | Early world model concepts (Schmidhuber) |
| 2018 | Ha & Schmidhuber: 'World Models' paper |
| 2024 | DeepMind Genie v1: 2D dream worlds |
| 2025 | Genie v3: Real-time 3D environments |
Understanding the Basics
Observation → Encoder → Latent State → Decoder
Key Components
Like RL agents, world models improve over time
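A minimal end-to-end sketch of that Observation → Encoder → Latent State → Decoder pipeline, plus one gradient step to show how the model improves over time; the dimensions, architecture, and data here are placeholders, not the setup of any system mentioned above:

```python
import torch
import torch.nn as nn

class TinyWorldModel(nn.Module):
    """Observation -> Encoder -> Latent State -> Decoder, with a learned transition."""
    def __init__(self, obs_dim=64, act_dim=4, latent_dim=16):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, latent_dim)                  # observation -> latent state
        self.transition = nn.Linear(latent_dim + act_dim, latent_dim)  # (latent, action) -> next latent
        self.decoder = nn.Linear(latent_dim, obs_dim)                  # latent state -> observation

    def forward(self, obs, action):
        z = torch.tanh(self.encoder(obs))
        z_next = torch.tanh(self.transition(torch.cat([z, action], dim=-1)))
        return self.decoder(z_next)  # predicted next observation

# Like an RL agent, the model improves over time: every gradient step shrinks
# the gap between predicted and actual next observations.
model = TinyWorldModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
obs, action, next_obs = torch.randn(32, 64), torch.randn(32, 4), torch.randn(32, 64)
loss = nn.functional.mse_loss(model(obs, action), next_obs)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```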