MIT AGI: Building machines that see, learn, and think like people (Josh Tenenbaum)

The guest

Josh Tenenbaum — MIT professor leading the Computational Cognitive Science group, affiliated with Brain and Cognitive Sciences, CSAIL, and the Center for Brains, Minds and Machines (CBMM).

The gist

In this MIT AGI lecture hosted by Lex Fridman, Josh Tenenbaum contends that today's AI systems are powerful 'AI technologies' for pattern recognition but lack common sense and true general intelligence. He argues that the most promising path to AGI is reverse-engineering how the human mind and brain build models of the world, doing cognitive science 'like an engineer.' He surveys visual intelligence and the limits of image-captioning systems, then shows how infants and even animals demonstrate physical reasoning, object permanence, planning, and spontaneous helping that machines cannot replicate. He presents the technical tools his group uses, including probabilistic programs, a 'game engine in the head' for intuitive physics and psychology, and program-learning approaches such as one-shot character learning. He closes with a discussion of industry versus academia, emotions, neural circuits, hardware, and energy efficiency.

Big reveals

Tenenbaum claims 'we don't really have any real AI' — only AI technologies that do single tasks without common sense or general intelligence.
00:02:06
Most foundational deep learning and reinforcement learning papers (backprop, perceptron, temporal difference) were originally published in psychology and cognitive science journals.
00:08:47
Andrej Karpathy emailed Tenenbaum saying that despite five years passing, 'I don't believe we've made very much progress' on real image understanding.
00:26:50
The core CBMM goal: build a robot that can spontaneously help humans around the house the way an 18-month-old child does, without being programmed or instructed.
00:43:50
Tenenbaum proposes that evolution built game-engine-like tools (approximate physics and simple agent planning) into the human brain as the cognitive core.
00:51:38
His group's 2015 Science cover paper achieved human-level one-shot concept learning of handwritten characters by inverting probabilistic 'drawing programs.'
01:09:42
He frames the oldest dream of AI — Turing's and Minsky's — as building a machine that starts like a baby and learns like a child, which he believes can be worked on now.
01:13:49
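
The 'simple agent planning' half of the game-engine idea above can be run in reverse: assume an agent moves roughly efficiently toward its goal, then use Bayes' rule to infer the goal from the moves seen so far. The following is a minimal sketch of that kind of Bayesian inverse planning, not the model from the lecture. The grid world, the softmax policy, the beta parameter, and all function names are assumptions made for illustration.

```python
import math

# Hypothetical action set: unit moves on a 2D grid (an assumption for this sketch).
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def action_probs(state, goal, beta=2.0):
    """Softmax policy: moves that end closer to the goal are more likely."""
    scores = []
    for dx, dy in ACTIONS:
        nxt = (state[0] + dx, state[1] + dy)
        scores.append(-beta * math.dist(nxt, goal))
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def goal_posterior(start, moves, goals, beta=2.0):
    """Posterior over goals given observed moves, starting from a uniform prior."""
    log_post = {name: 0.0 for name in goals}
    state = start
    for move in moves:
        idx = ACTIONS.index(move)
        for name, pos in goals.items():
            # Accumulate the log-likelihood of this move under each candidate goal.
            log_post[name] += math.log(action_probs(state, pos, beta)[idx])
        state = (state[0] + move[0], state[1] + move[1])
    top = max(log_post.values())
    unnorm = {name: math.exp(lp - top) for name, lp in log_post.items()}
    z = sum(unnorm.values())
    return {name: v / z for name, v in unnorm.items()}
```

For example, with hypothetical goals {"cup": (5, 3), "phone": (5, -3)} and a hand starting at (0, 0), one move to the right leaves the two goals equally likely, while two further upward moves shift most of the posterior to the cup, well before the hand arrives.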

Things worth remembering

The high-resolution central region of human vision (the fovea) is only about the size of your thumb held at arm's length, yet the brain stitches glimpses into a rich whole-world representation.
00:12:24
People can estimate how far the wall behind them is and roughly how many people are behind them despite not having looked, showing the brain tracks the unseen world.
00:13:26
Industry image-captioning systems surpassed human accuracy on held-out test sets but suffer 'dataset overfitting,' producing absurd captions on random web images.
00:21:08
A baby reaching behind himself for an unseen cup demonstrates 'object permanence' — representing objects as enduring even when out of sight for a minute or more.
00:36:35
In Warneken and Tomasello's experiments, human 18-month-olds spontaneously help adults in novel situations, while chimps do so far less reliably and flexibly.
00:40:43
Tenenbaum's 'intuitive physics engine' predicts human judgments about falling blocks by running a few short, low-precision game-style physics simulations.
00:52:09
Infant knowledge is studied via 'looking time' methods, in which babies look longer at surprising scenes; Tenenbaum's model linked low-probability events to longer looking times.
00:57:17
A model using the MuJoCo robotics physics engine plus Bayesian inverse planning predicts which object a person is reaching for before they touch it, matching human timing.
01:01:26
Neurons are slow, yet the brain computes intelligent behavior quickly on almost no energy: a burrito was enough to power a student writing 300 lines of code.
01:25:41
Tenenbaum speculates a single biological neuron may be as computationally powerful as a CPU core, implying the brain could be like ~10 billion cores.
01:29:20
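
The intuitive physics and looking-time items above can be illustrated with a toy example: perceive a block tower with some noise, run a few cheap checks, and report the fraction of runs in which it falls. This is a rough sketch under strong assumptions, not the model from the lecture. It swaps a real game-style physics simulation for a static center-of-mass test, assumes blocks of equal width and mass, and uses negative log probability as a stand-in for "surprise"; every name and parameter here is hypothetical.

```python
import math
import random

BLOCK_WIDTH = 1.0  # all blocks share one width and mass (assumption)


def tower_falls(centers):
    """Static check: True if the load on any block overhangs that block.

    centers[i] is the horizontal center of block i, bottom block first.
    """
    for k in range(len(centers) - 1):
        above = centers[k + 1:]
        com = sum(above) / len(above)  # center of mass of everything above block k
        if abs(com - centers[k]) > BLOCK_WIDTH / 2:
            return True
    return False


def p_fall(centers, n_sims=10, noise=0.1, rng=None):
    """Fraction of a few noisy 'mental simulations' in which the tower falls."""
    rng = rng or random.Random(0)
    falls = 0
    for _ in range(n_sims):
        # Perception is imprecise: jitter each block position before checking.
        perceived = [x + rng.gauss(0.0, noise) for x in centers]
        falls += tower_falls(perceived)
    return falls / n_sims


def surprise_bits(p_outcome, floor=1e-3):
    """Surprise of an observed outcome in bits; the floor avoids log(0)."""
    return -math.log2(max(p_outcome, floor))
```

For instance, tower_falls([0.0, 0.4, 0.8]) is True even though each block overlaps the one beneath it, because the top two blocks together overhang the base. If a viewer expects a tower to fall with probability p_fall(...) and it stays standing, surprise_bits(1 - p_fall(...)) grows as that outcome becomes less probable, which is the direction of the looking-time effect described above.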

Topics

artificial general intelligence, cognitive science, probabilistic programming, intuitive physics, child development, visual intelligence, deep learning limits, reverse engineering the brain