Google's Tensor Processing Unit (TPU), first deployed in 2015, today powers services used by more than one billion people and delivers more than an order of magnitude improvement in performance and performance per watt over contemporary platforms. Inspired by the success of the first TPU for neural network inference, Google developed multiple generations of machine learning supercomputers for neural network training, which allow near-linear scaling of ML workloads running on TPUv2 and TPUv3 processors. TPUs extend research frontiers and benefit a growing number of Google services.
Nishant Patil is a Tech Lead at Google focused on system architecture definition, co-design, and performance optimization for Google TPUs and other accelerator platforms. He has an MS and PhD in Electrical Engineering from Stanford University and a BS in Electrical and Computer Engineering from Carnegie Mellon University.