Yann LeCun: Deep Learning, ConvNets, and Self-Supervised Learning

The guest

Yann LeCun — Turing Award winner, founding father of convolutional neural networks, NYU professor and VP/Chief AI Scientist at Facebook

The gist

Yann LeCun discusses the philosophy and future of artificial intelligence, opening with value misalignment as illustrated by HAL 9000 in 2001: A Space Odyssey and the parallel between objective functions and human legal codes. He explains why huge over-parameterized neural nets defy classical textbook wisdom yet still work, and argues that intelligence is inseparable from learning. A central theme is that reasoning requires world models, working memory, and energy-minimization-based planning rather than brittle logic graphs. LeCun makes the case that human intelligence is highly specialized rather than general, and that self-supervised learning (learning models of the world by observation, as babies do) is the key missing piece toward more capable machines.

Big reveals

LeCun says the most surprising empirical fact in deep learning is that gigantic over-parameterized neural nets trained on relatively small data actually work, breaking every pre-deep-learning textbook rule.
00:07:50
He asserts neural networks can definitely be made to reason; the open question is how much prior structure to build in.
00:11:01
He explains why neural nets fell out of favor around 1995: hard to implement back-prop without Python/MATLAB, easy beginner mistakes, and AT&T lawyers blocking open-source release.
00:25:05
LeCun reveals he and his team wrote their own Lisp interpreter and later a compiler to build the LeNet convolutional net character-recognition system at Bell Labs.
00:27:39
He argues human intelligence is not general but highly specialized, using a thought experiment about randomly permuting optic nerve fibers.
00:38:42
He states self-supervised learning is the only thing he is currently interested in and the path forward, not unsupervised or active learning.
00:44:56
He notes today's RL would need millions of driving hours, killing thousands of pedestrians and running off cliffs repeatedly, while humans learn to drive in 20-30 hours via world models.
00:53:53
LeCun claims there will be no human-level intelligence without emotions, which arise from predicting future contentment or threat.
01:13:12

Things worth remembering

The decorations in the recording room are all pictures from 2001: A Space Odyssey, placed by design.
00:04:13
LeCun describes three brain memory types: ~20-second cortical state memory, the longer-term hippocampus, and long-term synaptic memory.
00:13:38
A then-recent paper by Léon Bottou and others addressed getting neural nets to pay attention to real causal relationships, possibly solving data bias.
00:21:26
A patent on convolutional networks at Bell Labs expired in 2007; LeCun spent 2002-2007 hoping nobody at NCR would notice he resumed the work.
00:29:14
Self-supervised learning works for NLP because uncertainty is easy to represent (a probability vector over ~100,000 words), but hard for images and video.
00:48:34
Model-free RL needs about 80 hours to reach a level humans reach in 15 minutes at Atari games.
00:52:48
AlphaStar's StarCraft training is equivalent to about 200 years of self-play.
00:53:21
Babies learn to distinguish animate from inanimate objects at 2-3 months, grasp object stability around 4 months, and understand gravity around 8-9 months.
01:04:47
LeCun describes three ways an agent can be stupid: wrong world model, misaligned objective (a psychopath), or inability to plan a good course of action.
01:08:27
He cites the Winograd schema (trophy/suitcase) as an example of common-sense reasoning requiring grounded world knowledge.
01:11:38
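
The point at 00:48:34 about representing uncertainty in text can be illustrated with a small sketch. It assumes a hypothetical four-word vocabulary and made-up scores (neither comes from the episode) and uses a softmax, one common way to turn scores into a probability vector, to show how a model can spread its belief over several candidate words for a masked position.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability vector that sums to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy vocabulary; the episode mentions roughly 100,000 words.
vocab = ["cat", "dog", "car", "tree"]

# Hypothetical scores a model might assign to each candidate masked word.
scores = [2.0, 1.5, -1.0, 0.0]

probs = softmax(scores)

# Rank candidates from most to least probable.
ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
for word, p in ranked:
    print(f"{word}: {p:.3f}")
```

For images or video there is no comparably small, enumerable set of outcomes to assign probabilities to, which is the difficulty the episode points to.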

Topics

deep learning, self-supervised learning, convolutional neural networks, reasoning and world models, AGI and intelligence, reinforcement learning, autonomous driving, AI history

Yann LeCun: Deep Learning, ConvNets, and Self-Supervised Learning | Lex Fridman Podcast #36
