MAERI 2.0 Tutorial @ ICS 2022
Overview
In this tutorial, we will educate the research community about the challenges in the emerging domain of Neural Network Accelerators, demonstrate the capabilities of MAERI 2.0 with examples, and discuss ongoing development efforts.
Date
- Jun 27th, 2022, 11:00-12:30 (CET) / 14:00-15:30 (ET)
- Virtual event.
Organizers
Tushar Krishna (Georgia Tech) Website LinkedIn Tushar Krishna is an Associate Professor in the School of ECE at Georgia Tech, which he joined in 2015. He also holds the ON Semiconductor Endowed Junior Professorship. He has a Ph.D. in Electrical Engineering and Computer Science from MIT (2014), an M.S.E. in Electrical Engineering from Princeton University (2009), and a B.Tech. in Electrical Engineering from the Indian Institute of Technology (IIT) Delhi (2007). Dr. Krishna’s research spans the computing stack: from circuits/physical design to microarchitecture to system software. His key focus area is architecting the interconnection networks and communication protocols for efficient data movement within computer systems, both on-chip and in the cloud.
Jianming Tong (Georgia Tech) Website LinkedIn Jianming Tong is a Ph.D. student at the Georgia Institute of Technology, which he joined in Spring 2021, advised by Prof. Tushar Krishna. He received his bachelor's degree from Xi'an Jiaotong University in 2020, supervised by Prof. Pengju Ren, and was a Research Assistant at Tsinghua University working with Prof. Yu Wang. His research interests include on-chip networks and efficient, high-performance domain-specific architecture design for neural network processing and homomorphic encryption acceleration.
Description
Machine learning (ML) has become pervasive today. ML for science relies on leveraging Deep Neural Networks (DNNs) in scientific problems, which makes it crucial for HPC systems to run DNNs efficiently. To this end, modern HPC systems are increasingly deploying customized high-performance dataflow accelerators (e.g., Cerebras, SambaNova) and FPGAs for running next-generation DNNs. The focus of this tutorial is on demonstrating the design and deployment of a dataflow accelerator called MAERI on FPGAs. Existing frameworks such as Gemmini, ESP, and OpenCGRA all support end-to-end deployment of DNN models, either through simulation or on a real FPGA. However, all of them assume a fixed underlying accelerator and can only explore mapping choices on that fixed accelerator architecture. There is still a need to answer which hardware architecture to choose and which hardware parameters are optimal for a given pattern of workloads.
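To make the design-space question concrete, the toy sketch below (purely illustrative, not MAERI code; the layer dimensions, PE-array sizes, and tile shapes are made-up examples) estimates how well different accelerator sizes are utilized under different spatial mappings of a single convolution layer:

```python
# Illustrative back-of-the-envelope sketch (not MAERI code): why the choice of
# hardware size and mapping matters. For one example convolution layer, we compare
# how well a few hypothetical PE-array sizes are utilized under different spatial
# tile shapes (output channels x output pixels mapped per pass).
import itertools

K, P = 64, 56 * 56            # example layer: 64 output channels, 56x56 output pixels

def avg_utilization(num_pes, tile_k, tile_p):
    """Average fraction of PEs kept busy, including edge tiles (simplified model)."""
    if tile_k * tile_p > num_pes:
        return None                                # this mapping does not fit on the array
    passes_k = -(-K // tile_k)                     # ceiling division over output channels
    passes_p = -(-P // tile_p)                     # ceiling division over output pixels
    total_passes = passes_k * passes_p
    busy = K * P                                   # each output element occupies one PE for one pass
    return busy / (total_passes * num_pes)

for num_pes, (tile_k, tile_p) in itertools.product([64, 256, 1024],
                                                   [(8, 8), (16, 16), (16, 64)]):
    u = avg_utilization(num_pes, tile_k, tile_p)
    label = "does not fit" if u is None else f"avg. utilization = {u:.2f}"
    print(f"PEs={num_pes:5d}  tile=({tile_k:2d},{tile_p:2d})  {label}")
```

Even in this simplified model, the best tile shape depends on the array size, and a fixed array that works well for one layer can sit mostly idle for another, which is exactly the gap a flexible accelerator template aims to close.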
As an ongoing project, we have been developing a framework, MAERI-FPGA, to deploy software models on a flexible-template FPGA accelerator with parallelism, scheduling, and hardware size as controllable knobs. Currently, MAERI-FPGA is deployed on Xilinx FPGAs (both embedded boards and PCI-E accelerator cards) to support deep learning models directly from the PyTorch framework. To the best of our knowledge, MAERI-FPGA is the first open-source FPGA framework for exploring the dataflow accelerator design space on real FPGAs with an end-to-end setup. In this tutorial, we will educate the research community about the impact of different design choices on accelerators, demonstrate the capabilities of MAERI-FPGA with examples, and discuss ongoing development efforts.
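As a rough sketch of what "directly from the PyTorch framework" means for a user, the snippet below shows the standard PyTorch/ONNX front-end steps; the `maeri_fpga` module and its knob names at the end are hypothetical placeholders (the real interface is covered in the tutorial and demos), used only to illustrate the controllable parameters described above:

```python
# A minimal sketch of the front-end flow. The PyTorch/ONNX calls are standard;
# the accelerator back-end at the end is hypothetical and shown commented out.
import torch
import torchvision

# 1. Start from an off-the-shelf PyTorch model (any torchvision classifier works).
model = torchvision.models.resnet18(weights=None).eval()

# 2. Export a static graph that a hardware back-end can consume; ONNX is a
#    common interchange format for FPGA tool flows.
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "resnet18.onnx", opset_version=13)

# 3. Hand the graph to the accelerator back-end. `maeri_fpga` and its parameters
#    (hardware size, scheduling, buffer size) are hypothetical names used only to
#    illustrate the controllable knobs mentioned above.
# import maeri_fpga
# accel = maeri_fpga.build(model="resnet18.onnx",
#                          num_multipliers=256,         # hardware size knob
#                          scheduling="heterogeneous",  # scheduling knob
#                          onchip_buffer_kb=512)        # on-chip buffer knob
# accel.deploy(board="xilinx-zcu104")
```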
Target Audience:
The tutorial targets students, faculty, and researchers who want to
- understand how to design DNN accelerators, or
- study performance implications of dataflow mapping strategies, or
- prototype a DNN accelerator (on ASIC or FPGA)
Schedule
Start (ET) | End (ET) | Agenda | Presenter | Resources |
---|---|---|---|---|
14:00 | 14:30 | Introduction and Background of NN Accelerators<br>- Overview of Neural Network Inference Workloads<br>- Dataflow and Mapping in Neural Networks<br>- DNN Accelerator Flexibility | Tushar | Slides |
14:30 | 14:40 | Break | | |
14:40 | 15:10 | MAERI 2.0: Flexible NN Accelerator Framework<br>- Supported Neural Network Models<br>- Quantization Flow<br>- Memory Layout<br>- Heterogeneous Scheduling<br>- MAERI 2.0 Microarchitecture | Jianming | Slides |
15:10 | 15:30 | Demo<br>- DEMO 1: Deploy MAERI 2.0 on a Xilinx Embedded FPGA Board<br>- DEMO 2: Explore the Performance Benefit of the On-chip Buffer using MAERI 2.0 | Jianming | Slides |
Resources
Installation
Repositories
MAERI-verilog: to be released; currently available upon request. Please email jianming.tong@gatech.edu if you need the source code.
Core Developers
- Jianming Tong (jianming.tong@gatech.edu, Georgia Tech)
- Yangyu Chen (yangyuchen@gatech.edu, Georgia Tech)
Additional Contributors
- Tushar Krishna (tushar@ece.gatech.edu, Georgia Tech)
- Abhimanyu Bambhaniya (abambhaniya3@gatech.edu, Georgia Tech)
- Taekyung Heo (taekyung@gatech.edu, Georgia Tech)
- Yue Pan (ypan331@gatech.edu, Georgia Tech)