Deep Learning Basics: Introduction and Overview

The guest

Lex Fridman — MIT researcher and instructor teaching the 6.S094 Deep Learning for Self-Driving Cars course series.

The gist

This is the opening lecture of MIT's 6.S094 deep learning course, delivered solo by Lex Fridman in early 2019. It defines deep learning as the automated extraction of useful patterns from data, traces the history of neural networks from the 1940s through AlexNet, GANs, AlphaGo, and BERT, and explains why the field broke through (data, compute, community, tooling). Fridman walks through the mechanics of neurons, backpropagation, loss functions, optimization, regularization, and normalization, then surveys major architectures and applications including CNNs, object detection, semantic segmentation, autoencoders, GANs, RNNs/LSTMs, and deep reinforcement learning. He repeatedly stresses the limits of current methods, the importance of AI safety, and the goal of relieving humans of menial tasks while keeping them involved in the big ethical questions.
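
As a rough illustration of the mechanics named above (a neuron, a loss, a backpropagated gradient, and a gradient-descent update), here is a minimal sketch in plain Python. It is not code from the lecture: the toy OR dataset, the learning rate, the epoch count, and the names train_neuron and predict are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Forward pass of one neuron: weighted sum of inputs, then sigmoid."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_neuron(data, lr=0.5, epochs=2000):
    """Fit a single sigmoid neuron with per-example gradient descent.

    lr and epochs are illustrative values, not taken from the lecture.
    """
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in data:
            y = predict(weights, bias, x)
            # With a sigmoid output and cross-entropy loss, the gradient of
            # the loss with respect to the pre-activation is (y - target).
            grad_z = y - target
            # Chain rule: d(loss)/d(w_i) = grad_z * x_i, d(loss)/d(b) = grad_z.
            weights = [w - lr * grad_z * xi for w, xi in zip(weights, x)]
            bias -= lr * grad_z
    return weights, bias

# Hypothetical toy dataset: the logical OR function.
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train_neuron(OR_DATA)
outputs = [round(predict(weights, bias, x)) for x, _ in OR_DATA]
print(outputs)
```

A single neuron is enough here only because OR is linearly separable; the deeper networks surveyed in the lecture stack many such units and propagate the same chain-rule gradients through every layer.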

Big reveals

A working image-classifying neural network can be trained in just six lines of code on the MNIST dataset.
00:09:59
Fridman argues deep learning is currently at or just beyond the peak of the Gartner hype cycle of inflated expectations.
00:17:23
Most autonomous-vehicle and humanoid-robotics success to date uses almost no machine learning except for perception.
00:18:26
A reinforcement learning agent in the boat-racing game CoastRunners learned to ignore the race entirely and farm regenerating turbos instead, illustrating reward hacking and the need for AI safety.
00:20:06
Adding tiny noise or even a single pixel can flip an image classifier from 99% 'dog' to 99% 'ostrich.'
00:25:25
A neural network with a single hidden layer can, given enough hidden units, approximate any continuous function to arbitrary accuracy (the universal approximation theorem).
00:36:43
Google's AutoML and neural architecture search aim to let you supply only a dataset and have the system design the network for you.
01:02:09
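
The adversarial-noise point in the list above can be sketched with a toy linear classifier. This is a simplified, hypothetical setup rather than the image model shown in the lecture: the 10,000 "pixels", the weights, and the two-class framing (a high probability means 'dog', a low one stands in for 'ostrich') are assumptions. The perturbation follows the sign of the gradient, in the spirit of the fast gradient sign method.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dog_probability(weights, pixels):
    """Toy linear classifier: probability that the input is a 'dog'."""
    return sigmoid(sum(w * p for w, p in zip(weights, pixels)))

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

# Hypothetical input: 10,000 pixel intensities on a 0-to-1 scale,
# with weights that alternate in sign.
n = 10_000
weights = [0.1 if i % 2 == 0 else -0.1 for i in range(n)]
pixels = [0.505 if i % 2 == 0 else 0.495 for i in range(n)]

# For a linear model the gradient of the logit with respect to the input
# is the weight vector, so stepping against sign(w) lowers the logit as
# fast as possible under a per-pixel budget of epsilon.
epsilon = 0.01
adversarial = [p - epsilon * sign(w) for p, w in zip(pixels, weights)]

clean_prob = dog_probability(weights, pixels)
adv_prob = dog_probability(weights, adversarial)
max_change = max(abs(a - p) for a, p in zip(adversarial, pixels))

print(f"clean: {clean_prob:.3f}  perturbed: {adv_prob:.3f}  max pixel change: {max_change:.3f}")
```

No pixel moves by more than 1% of its range, yet the many small changes add up in the weighted sum and swing the prediction from above 99% to below 1%. Real attacks on deep networks exploit the same accumulation across high-dimensional inputs.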

Things worth remembering

The lecture visualizes only 3% of the neurons and one-millionth of the synapses in the human brain.
00:05:41
Yann LeCun called Generative Adversarial Networks the most exciting idea of the last 20 years.
00:07:14
AlphaGo (2016) and AlphaZero (2017) reached superhuman play in Go, a game long considered out of reach for AI.
00:08:17
Humans have roughly 540 million years of visual-perception 'data' versus about 100,000 years of abstract thought.
00:24:52
The human brain has about ten million times more synapses than artificial neural networks.
00:34:36
Yann LeCun: 'Friends don't let friends use minibatches larger than 32.'
00:40:55
Human babies learn to walk in essentially one-shot learning, while machines often need thousands to millions of examples.
00:30:53
Drawing a straight line that separates two classes of points can be trivial in polar coordinates yet nearly impossible in Cartesian coordinates, illustrating why representation matters.
00:14:08
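
The representation point in the list above can be made concrete with a small sketch. The data here is hypothetical (an inner ring of radius 1 surrounded by an outer ring of radius 3), as are the helper names ring_points and classify. No straight line in the (x, y) plane separates the two rings, but after converting to polar coordinates a single threshold on the radius does.

```python
import math

def to_polar(x, y):
    """Convert Cartesian (x, y) to polar (radius, angle in radians)."""
    return math.hypot(x, y), math.atan2(y, x)

def ring_points(radius, count):
    """Evenly spaced points on a circle of the given radius (toy data)."""
    return [
        (radius * math.cos(2 * math.pi * k / count),
         radius * math.sin(2 * math.pi * k / count))
        for k in range(count)
    ]

def classify(point, threshold=2.0):
    """In polar form the decision boundary is the straight line r = threshold."""
    r, _theta = to_polar(*point)
    return "inner" if r < threshold else "outer"

inner = ring_points(1.0, 12)   # class A: small circle around the origin
outer = ring_points(3.0, 12)   # class B: larger circle enclosing class A

labels_inner = [classify(p) for p in inner]
labels_outer = [classify(p) for p in outer]
print(set(labels_inner), set(labels_outer))
```

The coordinate change is chosen by hand here; the lecture's argument is that deep learning aims to learn such useful representations from the data itself.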

Topics

deep learning, neural networks, computer vision, reinforcement learning, natural language processing, GANs, AI safety, autonomous vehicles