Domain Specific Architectures for Deep Neural Networks: Three Generations of Tensor Processing Units (TPUs)
Speaker: David Patterson (UC Berkeley/Google)

The recent success of deep neural networks (DNNs) has inspired a resurgence in domain-specific architectures (DSAs) to run them, partly because microprocessor performance improvement has slowed with the ending of Moore's Law. DNNs have two phases: training, which constructs accurate models, and inference, which serves those models. Google's first-generation Tensor Processing Unit (TPUv1) offered a 50X improvement in performance per watt over conventional architectures for inference. We naturally asked whether a successor could do the same for training. This talk reviews TPUv1 and explores how Google built the first production DSA supercomputer for the much harder problem of training, which was deployed in 2017.