Deep Learning State of the Art (2019)

The guest

Lex Fridman — MIT researcher and lecturer delivering an MIT deep learning course lecture on the state of the art in AI

The gist

This is a solo MIT lecture in which Lex Fridman reviews the most exciting developments in deep learning across 2017 and 2018. He frames 2018 as the year of natural language processing, walking through encoder-decoder architectures, attention, self-attention, the transformer, embeddings, ELMo, the OpenAI transformer, and the breakthrough of BERT. He then surveys applied deep learning, including Tesla Autopilot, AutoML and neural architecture search, data augmentation, synthetic data, annotation tools, cheap and accessible training benchmarks, GANs, and video-to-video synthesis. The final third covers deep reinforcement learning milestones such as DQN, AlphaGo, AlphaGo Zero, AlphaZero, and OpenAI's work on Dota 2, and closes with the maturing of frameworks and Geoff Hinton's call to rethink backpropagation.

Big reveals

Fridman declares 2018 the year of natural language processing, comparing it to the 2012 ImageNet moment for computer vision.
00:01:40
BERT is named the single biggest breakthrough of the year: it learns richly bidirectional representations by masking 15% of the input tokens and training the model to predict them.
00:11:42
Over one billion miles have been driven on Tesla Autopilot, with a neural network controlling decisions affecting human lives.
00:15:21
AlphaGo Zero beat AlphaGo after just a few days of training, learning entirely through self-play with zero supervision from expert games.
00:37:36
AlphaZero beat the top chess engine Stockfish after just four hours of (highly distributed) training.
00:38:42
In 2018, OpenAI's 5v5 Dota 2 team lost to top human players at The International, but it is expected to return.
00:41:51
Geoff Hinton, a godfather of deep learning, said of backpropagation: 'throw it all away and start again.'
00:44:33
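
The masking step in the BERT item above can be illustrated with a minimal sketch. It assumes a list of already-split tokens, a placeholder string `[MASK]`, and a helper name `mask_tokens` chosen purely for illustration; the full BERT pre-training recipe has further details (for example, it does not always substitute the mask symbol for a selected token) that are left out here.

```python
import random

MASK = "[MASK]"  # placeholder string, assumed for illustration


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Hide roughly `mask_rate` of the tokens and return (masked, targets).

    `targets` maps each masked position to the original token, which is
    what a model would be trained to predict from the surrounding context.
    """
    if not tokens:
        return [], {}
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_to_mask))

    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]  # remember the answer
        masked[pos] = MASK          # hide it from the model's input
    return masked, targets


if __name__ == "__main__":
    sentence = "the model reads context on both sides of every hidden word".split()
    masked, targets = mask_tokens(sentence)
    print(masked)
    print(targets)
```

Because the hidden tokens can sit anywhere in the sentence, a model trained on this task has to use context to the left and to the right of each gap, which is the sense in which the encoding is bidirectional.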

Things worth remembering

The 2012 AlexNet result is cited as the moment that proved what deep learning could do for computer vision.
00:01:40
ELMo uses bi-directional LSTMs to learn rich contextual word representations from both directions.
00:11:11
Tesla hardware version 1 used an Intel Mobileye monocular camera system that, as far as is publicly known, did not use a neural network. Hardware version 2 runs a neural network that is always learning, with weight updates shipped weekly.
00:15:53
The fast.ai group trained on ImageNet to 93% accuracy in 3 hours for about 25 dollars, and reached 94% on CIFAR-10 for 26 cents.
00:27:22
Google DeepMind's BigGAN produced incredibly high-resolution images mainly by scaling model capacity and batch size rather than new ideas.
00:29:33
DeepLabv3+ is the state-of-the-art open-source semantic segmentation system on the PASCAL VOC challenge, using atrous convolutions for multi-scale processing.
00:33:53
Deep Blue beat Kasparov in the 1990s by searching as far down the game tree as possible, whereas AlphaZero looks ahead far more sparingly, closer to the way a human player does.
00:39:12
The Dota 2 International competition's 2018 winning team received 11 million dollars in prize money.
00:41:19
PyTorch 1.0 came out in 2018, and TensorFlow 2.0 with eager execution was due in 2019, a sign that deep learning frameworks were maturing and standardizing.
00:44:03
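
The atrous (dilated) convolution mentioned in the DeepLabv3+ item above can be sketched in one dimension. This is a simplified illustration in plain Python, not DeepLab's implementation: the function name `atrous_conv1d` and its `rate` parameter are assumptions made for this example, and it computes the sliding dot product without flipping the kernel, as deep learning frameworks usually do.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """Slide `kernel` over `signal`, sampling inputs `rate` steps apart.

    With rate=1 this is an ordinary "valid" convolution. A larger rate
    leaves gaps between the sampled inputs, so the same number of weights
    covers a wider stretch of the signal.
    """
    span = (len(kernel) - 1) * rate + 1   # input width seen by one output
    out_len = len(signal) - span + 1
    if out_len <= 0:
        return []                         # signal too short for this rate
    return [
        sum(w * signal[i + k * rate] for k, w in enumerate(kernel))
        for i in range(out_len)
    ]


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5, 6, 7]
    kernel = [1, 0, -1]
    for rate in (1, 2, 3):
        print(rate, atrous_conv1d(signal, kernel, rate))
```

Applying the same kernel at several rates and combining the results is one way to look at the input at multiple scales without adding weights, which is the role the item above attributes to atrous convolutions.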

Topics

deep learning, natural language processing, transformers, BERT, self-driving cars, reinforcement learning, AutoML, GANs