MIT 6.S094: Deep Learning

The guest

Lex Fridman — MIT researcher and lecturer building human-centered autonomous vehicles; instructor of the 6.S094 Deep Learning for Self-Driving Cars course.

The gist

This is the opening lecture of MIT's 6.S094: Deep Learning for Self-Driving Cars, taught by Lex Fridman. He outlines the course structure, including three competitions (DeepTraffic, SegFuse, DeepCrash) and guest speakers from Waymo, Tesla, nuTonomy, Voyage, and Aurora. The bulk of the lecture is a conceptual primer on deep learning: representation learning, neural network fundamentals, activation functions, backpropagation, overfitting and regularization, and the history of breakthroughs on ImageNet. Fridman argues that autonomous vehicles are fundamentally personal robots that require human-centered AI, since perfect perception and control may be decades away and edge cases dominate. He closes by surveying open challenges such as transfer learning, reward definition, transparency, and generalization.

Big reveals

Fridman argues full autonomy could be two to four decades away because perfect perception and control in a human-filled world is extremely difficult.
00:10:34
Fridman claims the move to full autonomy almost requires human-level intelligence, tying it to fundamental problems of creating intelligence.
00:12:46
MIT runs two perception-control systems in parallel (Tesla Autopilot v1 and its own neural network) and raises a flag for human intervention when the two disagree.
00:19:09
In 2015, ResNet became the first network to beat the human-level benchmark of 5.1% error on the ImageNet classification challenge.
00:45:35
AlphaGo Zero (2017) beat AlphaGo after learning through self-play from zero information, with very little human input, and it generated moves that surprised human experts.
00:52:58
The CoastRunners example shows a reinforcement learning agent exploiting its reward function by repeatedly collecting regenerating points instead of finishing the race.
00:54:35
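
The disagreement check between the two parallel perception-control systems mentioned in this list can be sketched roughly as follows. This is only an illustration, not MIT's implementation: the function name `needs_human`, the choice of steering angle in degrees as the compared output, the 5-degree threshold, and the sample values are all assumptions made for the example.

```python
def needs_human(steer_a_deg: float, steer_b_deg: float,
                threshold_deg: float = 5.0) -> bool:
    """Return True when the two systems disagree by more than the threshold.

    All parameters are hypothetical: the lecture summary does not say which
    outputs are compared or how large a disagreement triggers the flag.
    """
    return abs(steer_a_deg - steer_b_deg) > threshold_deg


# Made-up steering outputs (degrees) from the two systems over four frames.
system_a = [0.0, 1.5, 10.0, -2.0]
system_b = [0.5, 1.0, 2.0, -9.5]

# One flag per frame: True means "ask the human to take over".
flags = [needs_human(a, b) for a, b in zip(system_a, system_b)]
print(flags)
```

The design idea is that neither system has to be trusted on its own: agreement is treated as weak evidence that the situation is routine, and disagreement as a cheap signal that a human should look.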

Things worth remembering

The DeepTraffic competition received over 18,000 submissions the previous year, and the new version lets you control up to ten cars via multi-agent deep reinforcement learning.
00:03:13
The full human brain has roughly 100 billion neurons and 1,000 trillion synapses.
00:20:12
ResNet-152 has about 60 million synapses, roughly seven orders of magnitude fewer than the human brain.
00:21:17
Visual perception began evolving about 540 million years ago, while abstract thought is only about 100,000 years old.
00:40:51
The ImageNet dataset contains 14 million images across 21,000 categories, including about 1,200 images of Granny Smith apples.
00:43:58
Fridman cites Einstein's idea that the key mark of intelligence is imagination, applying it to AI surprising human experts.
00:53:31
State-of-the-art networks can be fooled into classifying noise as a cheetah, armadillo, or school bus with over 99% confidence.
00:55:36
Course sponsors thanked include Nvidia, Google, Autoliv, Toyota, and Amazon Alexa.
01:01:27
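
The seven-orders-of-magnitude comparison in this list follows directly from the two synapse counts quoted. A quick check, using only those figures (the variable names are ours):

```python
import math

brain_synapses = 1_000e12   # 1,000 trillion, as quoted for the human brain
resnet_synapses = 60e6      # about 60 million, as quoted for ResNet-152

# How many times more connections the brain has, and the same gap
# expressed in powers of ten.
ratio = brain_synapses / resnet_synapses
orders_of_magnitude = math.log10(ratio)

print(f"ratio: {ratio:.3g}, orders of magnitude: {orders_of_magnitude:.1f}")
```

The ratio comes out near 1.7 × 10^7, which is where "roughly seven orders of magnitude" comes from.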

Topics

deep learning, self-driving cars, neural networks, reinforcement learning, computer vision, ImageNet, human-centered AI, MIT course