MIT 6.S093: Introduction to Human-Centered Artificial Intelligence (AI)

The guest

Lex Fridman — MIT researcher and lecturer in AI and deep learning, teaching the Human-Centered Artificial Intelligence course

The gist

This is the introductory lecture of MIT's Human-Centered Artificial Intelligence course, delivered solo by Lex Fridman. He argues that learning-based methods like deep learning will dominate real-world applications, but because such systems can never be provably safe, fair, or explainable, humans must be integrated into both the training (machine teaching, reward engineering) and operation (supervision, uncertainty signaling) phases. He surveys the state of the art in human-perception tasks such as face recognition, activity recognition, and body pose estimation. He closes with his own MIT research on AI safety via 'arguing machines' and human-centered autonomous driving, framing the human-AI relationship as symbiosis rather than parasitism.

Big reveals

Fridman argues learning-based systems can never be provably safe, fair, or explainable, so human supervision is permanently required.
00:03:40
He predicts human-AI collaboration at every step will be the defining mode of AI operation in the 21st century.
00:11:31
He claims the key unsolved challenge is an AI system reliably knowing and signaling when it is uncertain about something.
00:32:26
His 'arguing machines' framework seeks safety by detecting disagreement between multiple AI systems to trigger human supervision.
00:58:25
Applying arguing machines to ResNet (8% error) and VGG-16 (10% error) on ImageNet dropped the error rate to 2.8%.
01:00:58
Fridman states AI systems will not be perfect for the next hundred years, so humans and AI must work together as flawed partners.
01:03:55
He frames the future of AI as symbiosis, where learning happens naturally through interaction rather than costly offline annotation.
01:04:36
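
The disagreement-based supervision described in the 'arguing machines' items above can be sketched roughly as follows. This is a minimal illustration, not the lecture's actual implementation: the names `top_label`, `arbitrate`, and `ask_human` are hypothetical, and every confidence value except the 93% quoted in this summary is invented for the example.

```python
from typing import Callable, Dict, Tuple

# A prediction maps each candidate label to a confidence score.
Prediction = Dict[str, float]


def top_label(pred: Prediction) -> str:
    """Return the label with the highest confidence."""
    return max(pred, key=pred.get)


def arbitrate(
    primary: Prediction,
    secondary: Prediction,
    ask_human: Callable[[Prediction, Prediction], str],
) -> Tuple[str, bool]:
    """Accept the answer when both systems agree; otherwise escalate.

    Returns (final_label, escalated), where escalated is True when the
    two systems disagreed and the human was consulted.
    """
    a, b = top_label(primary), top_label(secondary)
    if a == b:
        return a, False
    # Disagreement is treated as a signal of uncertainty,
    # even if one system reports very high confidence.
    return ask_human(primary, secondary), True


# Illustrative values only: 0.93 comes from the summary, the rest are assumed.
resnet_pred = {"paper towel": 0.93, "wine bottle": 0.04, "seat belt": 0.03}
vgg_pred = {"seat belt": 0.50, "wine bottle": 0.30, "paper towel": 0.20}

# Stand-in for a human supervisor who looks at the image and answers.
label, escalated = arbitrate(resnet_pred, vgg_pred, lambda p, s: "wine bottle")
print(label, escalated)
```

The design point is that neither system's own confidence is trusted: a 93% confident wrong answer is still caught because the second system disagrees.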

Things worth remembering

The machine-teaching paradigm flips annotation so the algorithm queries humans rather than humans brute-force labeling data first.
00:08:55
He cites an OpenAI boat-race RL agent that maximized reward by collecting green turbos instead of finishing the race.
00:19:30
Fridman proposes replacing the US Congress with an AI recommender system as a grand-challenge thought experiment.
00:21:13
He says we are not even close to real emotion recognition; systems only detect caricatured facial expressions.
00:25:35
Over a billion miles have been driven under Tesla Autopilot, and Waymo has surpassed 10 million autonomous miles.
00:28:19
DeepFace was the first deep-learning system to achieve near-human performance on Labeled Faces in the Wild.
00:40:23
A 2017 public face dataset contained 672,000 identities and 4.7 million photos.
00:42:29
Recent CMU work first detects all joints of each type (all elbows, all knees), then stitches them into individual skeletons via part affinity fields for real-time multi-person pose estimation.
00:54:39
In one arguing-machines example, ResNet mislabeled a wine bottle as a paper towel with 93% confidence while VGGNet guessed seat belt.
01:01:29
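
The machine-teaching item above (00:08:55), where the algorithm queries humans instead of waiting for brute-force labels, can be illustrated with uncertainty sampling. This is an assumption: uncertainty sampling is one common active-learning strategy, and the summary does not say which query strategy the lecture has in mind. The function names and the sample pool are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_queries(pool: List[Tuple[str, Sequence[float]]], k: int) -> List[str]:
    """Pick the k unlabeled items the model is least sure about.

    Each pool entry is (item_id, predicted_class_probabilities).
    The returned ids are the ones to send to a human annotator.
    """
    ranked = sorted(pool, key=lambda item: entropy(item[1]), reverse=True)
    return [item_id for item_id, _ in ranked[:k]]


# Hypothetical model outputs for three unlabeled images.
pool = [
    ("img_a", [0.98, 0.01, 0.01]),  # confident: little value in asking
    ("img_b", [0.40, 0.35, 0.25]),  # very uncertain: ask first
    ("img_c", [0.70, 0.20, 0.10]),  # moderately uncertain
]

print(select_queries(pool, 2))
```

Human effort is spent only where the model's predictions are least settled, rather than on labeling the whole dataset up front.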

Topics

human-centered AI, deep learning, machine teaching, face recognition, AI safety, autonomous vehicles, computer vision, reinforcement learning