Lex Fridman · 2016-09-27 · 1h 02m

TensorFlow Tutorial (Sherry Moore, Google Brain)

Google Brain's Sherry Moore gives a hands-on TensorFlow tutorial, building linear regression and MNIST digit-recognition models live with the audience.

The guest

Sherry Moore: software engineer on the Google Brain team who worked on TensorFlow alongside researchers, including Alex Krizhevsky, the creator of AlexNet.

The gist

Sherry Moore of Google Brain introduces TensorFlow, Google's open-source machine learning library that became the most popular ML project on GitHub. She explains the core concepts: tensors as multi-dimensional arrays, computation graphs of connected nodes, and a modular architecture spanning front-end languages, a core execution runtime, and portable device kernels (CPU, GPU, phones, TPU). The bulk of the session is a live coding lab in which the audience builds two classic models in Jupyter notebooks: a linear regression that fits a mystery line, and an MNIST handwritten-digit classifier with hidden layers. She also covers practical infrastructure: placeholders for feeding data, savers and checkpoints, the global step counter, and evaluation. The talk ends with an extended audience Q&A about C++ APIs, Windows/ARM support, TPU availability, serving, and loading custom datasets.
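The graph-and-placeholder idea can be illustrated without TensorFlow at all. The following is a minimal sketch in plain Python, not TensorFlow's API: the names `Node`, `placeholder`, `constant`, `add`, `multiply`, and `run` are hypothetical and exist only for this example. It shows the two-phase pattern the talk describes, where a graph of connected nodes is built first and values are fed in only when the graph is run.

```python
# A toy dataflow graph in plain Python (hypothetical names, not TensorFlow's API).

class Node:
    """One operation in the graph; `inputs` are the upstream nodes."""
    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.name = name

def placeholder(name):
    # A placeholder has no op; its value is supplied at run time.
    return Node(op=None, name=name)

def constant(value):
    return Node(op=lambda: value)

def add(a, b):
    return Node(op=lambda x, y: x + y, inputs=(a, b))

def multiply(a, b):
    return Node(op=lambda x, y: x * y, inputs=(a, b))

def run(node, feed):
    """Evaluate `node` by first evaluating everything it depends on."""
    if node.op is None:
        return feed[node.name]
    args = [run(n, feed) for n in node.inputs]
    return node.op(*args)

# Build the graph for y = w * x + b. Nothing is computed yet.
x = placeholder("x")
w = constant(3.0)
b = constant(0.5)
y = add(multiply(w, x), b)

# Run the same graph twice with different fed values.
result_a = run(y, {"x": 2.0})   # 3.0 * 2.0 + 0.5
result_b = run(y, {"x": -1.0})  # 3.0 * -1.0 + 0.5
```

Separating graph construction from execution is what lets one graph be reused with different inputs; a real runtime would also decide which device kernel executes each node, which this sketch does not attempt.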

Big reveals

  • TensorFlow became the most popular machine learning library on GitHub with over 32,000 stars, 14,000 forks, and 8,000 contributions from 400 developers.
  • TensorFlow's flexible dataflow infrastructure suits almost any application whose operations can fire asynchronously as soon as their data is ready, not just machine learning.
  • Over 10% of all responses sent on mobile in February were generated by Google's Smart Reply.
  • When Smart Reply was first trained, its first answer to everything was always 'I love you'.
  • Training Inception originally took about six days, and even with replicas still took about two and a half days, which is why checkpointing is critical.
  • Moore cites roughly 78.6% as the state-of-the-art accuracy benchmark she watches for when training Inception.
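To see why checkpointing matters for multi-day training runs, here is a minimal sketch of the save-and-resume pattern in plain Python. It is not TensorFlow's saver API: the functions `save_checkpoint`, `load_checkpoint`, and `train`, the JSON file format, and the fake "training step" are all assumptions made for illustration.

```python
import json
import os
import tempfile

def save_checkpoint(path, step, state):
    """Write the step counter and model state, then rename into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"global_step": step, "state": state}, f)
    # Rename over the old file so a crash mid-write does not corrupt it.
    os.replace(tmp_path, path)

def load_checkpoint(path):
    """Return (step, state), or (0, None) if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0, None
    with open(path) as f:
        data = json.load(f)
    return data["global_step"], data["state"]

def train(path, total_steps, save_every):
    # Resume from the last checkpoint if there is one.
    step, state = load_checkpoint(path)
    if state is None:
        state = {"weight": 0.0}
    while step < total_steps:
        state["weight"] += 0.5  # stand-in for one real training step
        step += 1
        if step % save_every == 0:
            save_checkpoint(path, step, state)
    return step, state

checkpoint_path = os.path.join(tempfile.mkdtemp(), "model.ckpt.json")

# The first run stops after 20 steps...
first_step, first_state = train(checkpoint_path, total_steps=20, save_every=10)
# ...and the second run picks up at step 20 instead of starting over.
resumed_step, _ = load_checkpoint(checkpoint_path)
final_step, final_state = train(checkpoint_path, total_steps=50, save_every=10)
```

Storing the step counter alongside the weights is what makes the resume exact: the second call continues from the recorded step rather than repeating finished work.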

Things worth remembering

  • Moore sat right next to Alex Krizhevsky, the inventor of AlexNet, while developing TensorFlow.
  • In TensorFlow, all data is held in tensors: multi-dimensional arrays, similar to NumPy's ndarray, that flow through the graph.
  • The same TensorFlow graph can be dispatched to different device kernels: CPU, GPU, phone, or TPU.
  • TensorFlow-trained systems not only learn to play games but learn to generate game scenarios for you to play.
  • MNIST (Modified National Institute of Standards and Technology) is a collection of handwritten digits.
  • MNIST pixel values are normalized to between 0 and 1, so hand-drawn uploads, whose pixels often range from 0 to 255, must be rescaled (for example, divided by 255) before the model can recognize them.
  • Google's image captioning model would label anything it didn't recognize, like a watermelon on a post, as 'man talking on a cell phone'.
  • TensorFlow could not support Windows at the time because its build tool, Bazel, did not yet support Windows.
  • Moore frames every model around four pieces: data, an inference (forward) graph, a training graph with loss and optimizer, and running the graph.
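The four-piece framing (data, inference graph, training graph, run) can be sketched for the mystery-line exercise in plain Python. This is an illustration under stated assumptions, not the notebook code from the talk: the true line y = 2x + 1, the learning rate, the step count, and the function names `inference`, `loss`, and `train_step` are all chosen for this example, and hand-written gradient descent stands in for a library optimizer.

```python
import random

# 1. Data: points on a "mystery" line y = 2x + 1 (values chosen for illustration).
random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(100)]
ys = [2.0 * x + 1.0 for x in xs]

# 2. Inference (forward) step: the model's prediction for one input.
def inference(x, w, b):
    return w * x + b

# 3. Training: a mean squared error loss and one gradient descent update.
def loss(w, b):
    return sum((inference(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_step(w, b, learning_rate):
    n = len(xs)
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2.0 * (inference(x, w, b) - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (inference(x, w, b) - y) for x, y in zip(xs, ys)) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b

# 4. Run: repeat the training step, starting from an uninformed guess.
w, b = 0.0, 0.0
initial_loss = loss(w, b)
for step in range(500):
    w, b = train_step(w, b, learning_rate=0.1)
final_loss = loss(w, b)
```

After the loop, `w` and `b` should sit close to the slope and intercept used to generate the data, and `final_loss` should be far below `initial_loss`. The same four-part skeleton carries over to the MNIST classifier; only the data and the inference step grow more elaborate.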