Lex Fridman · 2020-12-13 · 1h 56m

Michael Littman: Reinforcement Learning and the Future of AI | Lex Fridman Podcast #144

Brown professor Michael Littman on reinforcement learning, why he is unmoved by AI doom, AlphaGo, and the social subtlety of self-driving cars.

The guest

Michael Littman: computer science professor at Brown University who researches and teaches machine learning, reinforcement learning, and AI. A prolific maker of computer-science parody songs who once had a cameo in a TurboTax commercial, he has had a front-row seat to RL's history since the 1980s.

The gist

Lex Fridman talks with reinforcement learning researcher Michael Littman in a lighthearted, wide-ranging conversation. They open with sci-fi, music taste, Littman's parody-song hobby, and his TurboTax commercial before diving into AI. Littman lays out why he is skeptical of the superintelligence existential-threat argument, arguing we will learn to control these systems as we build them. They trace the history of reinforcement learning from TD-Gammon to AlphaGo and AlphaZero, debate the limits of self-play and language models like GPT-3, and explore why driving is a surprisingly social problem. The episode closes with book recommendations and Littman's view that the meaning of life is balance.

Big reveals

  • Littman admits a strong opinion: he is 'not particularly moved' by the idea that we will accidentally create a superintelligence that destroys humanity.
  • He reveals he once wrote an op-ed pushing back on Elon Musk's AI warnings, arguing Musk's belief in the power of ideas is both his strength and his blind spot.
  • Littman estimates humanity may have had a 30-40 percent chance of destroying itself with nuclear weapons in the 20th century.
  • He confesses that he has almost no memories from ages 13 to 15, having spent those years alone in his room programming his TRS-80.
  • He admits that generation after generation of his students failed to replicate TD-Gammon's results, concluding Gerald Tesauro is a 'neural net whisperer.'
  • Littman says AlphaGo impressed him more than AlphaZero, disagreeing with colleague Satinder Singh, who found the no-human-data version more breathtaking.
  • Littman reveals he barely reads books and once joked he got into college on a 'help out the illiterate' program.
  • Watching his son learn to drive revealed to him that driving is fundamentally a social-interaction activity, a blind spot in self-driving research.

Things worth remembering

  • In 2011 Littman started listening to the weekly Billboard top 10 on the treadmill and discovered he has 'no musical taste'—he just likes whatever he has heard most recently.
  • On his TurboTax shoot, about 50 people filled one room, including crew whose only job was holding sun filters and one person assigned to keep him from getting lost.
  • Littman wrote a parody song about the halting problem set to Billy Joel's 'Piano Man,' prizing its internal rhymes.
  • He worked at Bellcore with Dave Ackley, first author of the Boltzmann machine paper, the first neural net that could handle XOR.
  • Littman essentially tried to reinvent reinforcement learning before Dave Ackley handed him Rich Sutton's 1984 TD paper and arranged for Sutton to visit.
  • Littman used the term 'self-play' in his 1996 PhD dissertation; the term 'rollout' comes from backgammon via TD-Gammon.
  • In top-tier chess, teams of humans working with computer programs can still beat the best standalone programs, though that gap is asymptoting.
  • A 'counter Moore's law' exists: development cost per chip generation also doubles, so development money per cycle stays roughly constant.
  • Littman frames Moore's law not as one exponential but as hundreds or thousands of S-curves stacked on top of each other.
  • For his 42nd birthday Littman threw a 'meaning of life' party with slide presentations; his answer was 'balance,' demonstrated with a unicycle and a RipStik.

Recommended in this episode

Books, products and media the guest or host genuinely endorsed in this episode.

Recommended media

Robot & Frank

Jake Schreier (inferred)

“there's a movie called robot and frank which i think is really interesting because it's very near-term future” — Michael Littman 00:02:37

Recommended product

Kinesis keyboard

Kinesis

“it's a kinesis keyboard which is uh this butt shaped keyboard yes i've seen them yeah they're very uh i don't know sexy elegant” — Lex Fridman 00:57:06

Recommended book

Program or Be Programmed

Douglas Rushkoff

“i find myself thinking of program or be programmed a lot by douglas rushkoff um which was it basically put out the premise” — Michael Littman 01:47:03

Recommended book

Human Compatible

Stuart Russell

“i think i think stuart's book did a remarkably good job like a just a celebratory good job at describing ai technology and sort of how it works” — Michael Littman 01:49:06

Recommended book

Exhalation

Ted Chiang

“one sci-fi book to recommend is exhalations by ted chiang a bunch of short stories” — Michael Littman 01:51:08