Turing Test: Can Machines Think?

The guest

Lex Fridman — AI researcher and host of the Lex Fridman Podcast. Here he presents solo, kicking off an AI paper reading club with Turing's foundational paper.

The gist

This is a solo lecture-style presentation, the first in Lex Fridman's AI paper reading club, walking through Alan Turing's 1950 paper 'Computing Machinery and Intelligence' and the imitation game it proposes. Lex explains the Turing test, real-world attempts at it such as the Loebner Prize and the Eugene Goostman claim, and Google's Meena chatbot with its sensibleness/specificity metric. He methodically covers the nine objections Turing anticipated and answered, plus Searle's Chinese Room, then surveys alternative tests including the Lovelace test, Winograd Schema Challenge, Amazon Alexa Prize, Hutter Prize compression challenge, and Francois Chollet's ARC benchmark. He concludes that natural-language conversation remains the ultimate test of intelligence and that researchers wrongly dismiss it as a distraction.

Big reveals

Lex notes the Loebner Prize is apparently no longer funded, lamenting that major labs like DeepMind and Facebook AI never took on the Turing test challenge.
00:08:50
He argues that nobody has studied how to pin down exactly which parts of a conversation reveal that a bot is not human.
00:11:29
He calls out Google's Meena results as possibly PR-driven and closed-source, urging the numbers be taken with a grain of salt.
00:15:37
Lex says he goes back and forth daily on whether mimicking thinking equals thinking, and whether mimicking consciousness equals consciousness.
00:30:11
He admits the poetic claim 'the appearance of consciousness is consciousness' reflects his real engineering stance.
00:31:13
He openly disagrees with Francois Chollet and Stuart Russell, insisting the Turing test is not a distraction but keeps researchers honest.
00:54:25
He concludes the real test of human-level intelligence will be deep, meaningful connection between human and machine via open-domain conversation.
00:55:59

Things worth remembering

Turing predicted that by 2000 a machine with about 100MB of storage would fool 30% of human judges in a five-minute conversation.
00:04:40
The Loebner Prize, running since 1991, offered $25,000 for text-only passing and $100,000 for multimodal.
00:07:48
In 2014 the Eugene Goostman bot fooled 33% of judges by posing as a 13-year-old Ukrainian boy with a language barrier.
00:11:29
Humans score 97% on sensibleness; on combined sensibleness and specificity humans hit 86%, Meena 79%, Mitsuku 56%.
00:14:36
The Lovelace objection from Ada Lovelace holds machines can only do what we program; Turing rephrased it as 'machines can't surprise us.'
00:21:53
The Hutter Prize challenges entrants to compress 1GB of Wikipedia text; the current best is ~117MB, and the prize pays 5,000 euros per 1% improvement.
00:41:48
Lex muses that intelligence may be the journey, not the destination, judging systems over time rather than instantaneously.
00:36:36
Francois Chollet's ARC challenge uses a colored grid world to test abstract reasoning, in a format close to human IQ tests.
00:43:22
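
The arithmetic behind two of the figures above can be checked with a short sketch. It assumes the combined Meena score is the simple average of sensibleness and specificity, that 1GB means 1000MB, and that the Hutter payout scales linearly with the percentage improvement. The function names and the 115MB entry are hypothetical, chosen only for illustration.

```python
# Sketch of the arithmetic behind the Meena and Hutter Prize figures.
# Assumptions (not stated in the episode summary): the combined score is
# the plain mean of the two rates, 1GB = 1000MB, and the payout is linear
# in the percentage improvement. All function names are hypothetical.


def ssa(sensibleness: float, specificity: float) -> float:
    """Combined score, taken here as the mean of the two rates."""
    return (sensibleness + specificity) / 2


def implied_specificity(sensibleness: float, ssa_score: float) -> float:
    """Invert the mean to recover the specificity rate."""
    return 2 * ssa_score - sensibleness


def compression_ratio(original_mb: float, compressed_mb: float) -> float:
    """How many times smaller the compressed file is."""
    return original_mb / compressed_mb


def prize_euros(previous_mb: float, new_mb: float,
                euros_per_percent: float = 5000.0) -> float:
    """Payout for shrinking the record from previous_mb to new_mb."""
    improvement_percent = 100 * (previous_mb - new_mb) / previous_mb
    return euros_per_percent * improvement_percent


if __name__ == "__main__":
    # Humans: 97% sensibleness and 86% combined imply this specificity.
    print(f"implied human specificity: {implied_specificity(0.97, 0.86):.2f}")
    # 1GB of Wikipedia compressed to roughly 117MB.
    print(f"compression ratio: {compression_ratio(1000, 117):.2f}x")
    # Hypothetical new entry at 115MB against the 117MB record.
    print(f"payout for 117MB -> 115MB: {prize_euros(117, 115):.0f} euros")
```

Under these assumptions, the human figures imply a specificity of about 75%, and the 117MB record corresponds to compressing the text to under an eighth of its original size.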

Recommended in this episode

On the Measure of Intelligence

Francois Chollet

“Here's just a couple of examples of priors that Francois shows in his paper, I recommend highly, it's called On the Measure of Intelligence.” Lex Fridman, 00:44:57


Topics

Turing test, artificial intelligence, Chinese Room, consciousness, chatbots, machine learning benchmarks, ARC challenge, philosophy of mind