Fine-Grain Many-Core Processor Arrays for Efficient and High-Performance Computation

Topic: 
Fine-Grain Many-Core Processor Arrays for Efficient and High-Performance Computation
Thursday, January 17, 2019 - 4:30pm to 5:30pm
Venue: 
Bldg. 380 Rm. 380X
Speaker: 
Prof. Bevan Baas - Electrical and Computer Engineering - UC Davis
Abstract / Description: 

The continually growing number of devices available per chip assures the presence of many processing blocks per die communicating by some type of inter-processor interconnect. It is interesting to consider what the granularity of the processing blocks should be given a fixed amount of die area. The smallest reasonable tile size is on the order of an FPGA's LUT. Between the domains of FPGAs and traditional processors lies a lightly explored region which we call fine-grain many-core, whose processors: can be programmed by simple traditional programs; typically operate with high throughput and high energy-efficiency; are well suited for deep submicron fabrication technologies; and are well matched to many DSP, multimedia, and embedded workloads, and, somewhat counterintuitively, also to some enterprise and scientific kernels.

The AsAP project has developed fine-grain many-core systems composed of large numbers of programmable reduced-complexity processors with no algorithm-specific hardware and with individual per-processor digitally-tunable clock oscillators operating completely independently with respect to each other (GALS). Due to the independence of the MIMD cores and individual near-optimal oscillator halting, the system operates with a power dissipation that is almost ideally proportional to the system load.

A third-generation 32 nm design that integrates 1000 independent programmable processors and 12 memory modules has been designed and fabricated. The processors and memory modules communicate through a reconfigurable full-rate circuit-switched mesh network and a complementary very-small-area packet router, and they operate up to an average maximum clock frequency of 1.78 GHz, which is believed to be the highest clock frequency achieved by a fabricated processor designed in a university. At a supply voltage of 0.9 V, processors operating at an average of 1.24 GHz dissipate 17 mW each while issuing one instruction per cycle. At 0.56 V, processors operating at 115 MHz dissipate 0.61 mW each, resulting in 5.3 pJ/instruction and enabling 1000 100%-active cores to be powered by a single AA battery.
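The energy figures above follow from dividing per-processor power by instruction rate. The short Python sketch below reproduces that arithmetic; the function names are illustrative only, and it assumes one instruction per cycle at both operating points (the abstract states this explicitly for the 0.9 V point, and the quoted 5.3 pJ/instruction is consistent with it at 0.56 V).

```python
def energy_per_instruction_pj(power_mw: float, freq_mhz: float, ipc: float = 1.0) -> float:
    """Energy per instruction in picojoules for one processor.

    power_mw: per-processor power in milliwatts
    freq_mhz: clock frequency in megahertz
    ipc:      instructions issued per cycle (assumed 1.0 here)
    """
    power_w = power_mw * 1e-3
    instructions_per_second = freq_mhz * 1e6 * ipc
    return power_w / instructions_per_second * 1e12


def array_power_w(power_mw: float, cores: int) -> float:
    """Total power in watts when every core is fully active."""
    return power_mw * 1e-3 * cores


# Operating points quoted in the abstract.
print(f"{energy_per_instruction_pj(0.61, 115):.1f} pJ/instruction at 0.56 V")  # 5.3
print(f"{energy_per_instruction_pj(17, 1240):.1f} pJ/instruction at 0.9 V")    # 13.7
print(f"{array_power_w(0.61, 1000):.2f} W for 1000 cores at 0.56 V")           # 0.61
```

At the 0.56 V point the full 1000-core array draws about 0.61 W, which is the load the abstract describes as being within reach of a single AA battery.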

Several dozen DSP and general tasks have been coded plus more complex applications including: AES encryption engines, a full-rate H.264 1080p 30fps HDTV residual encoder, a fully-compliant IEEE 802.11a/11g Wi-Fi wireless LAN baseband transmitter and receiver, a SAR radar engine, a complete first-pass H.264 encoder, convolutional neural networks, large sparse matrix operations, sorting and processing of enterprise data, and others. Power, throughput, and die area results generally compare very well with solutions on existing programmable processors. A C++ compiler and automatic mapping tool greatly simplify programming.

Bio: 

Bevan Baas received M.S. and Ph.D. degrees in electrical engineering from Stanford University in 1990 and 1999, respectively. After graduation, he joined Atheros Communications as an early employee and served as a core member of the team which developed the first IEEE 802.11a (54 Mbps, 5 GHz) Wi-Fi solution. From 1987-89, he worked on high-end minicomputers in Hewlett Packard's Computer Systems Division. In 2003, he joined the Department of Electrical and Computer Engineering at the University of California, Davis, where he is now a Professor.

Dr. Baas' research interests are in the algorithms, architectures, circuits, and VLSI for high-performance, energy-efficient, and area-efficient computation. He is interested in both programmable and special-purpose processors with an emphasis on DSP, multimedia, and embedded workloads as well as enterprise and scientific kernels.

Dr. Baas was an NSF Fellow from 1990-93 and a NASA GSR Fellow from 1993-96. He received the National Science Foundation CAREER award in 2006, the Best Paper Award at the IEEE Intl Conference on Computer Design in 2011, Best Student Paper Award 3rd place at IEEE Intl MSCS 2015, Best Student Paper Award 3rd place at IEEE Asilomar 2014, "WACIest" Best-In-Session Paper at DAC 2010, several Best Paper nominations, the Most Promising Engineer/Scientist Award by AISES in 2006, and he supervised the research that earned the College of Engineering Best Doctoral Dissertation Award Honorable Mention in 2013. He is currently an Associate Editor for the IEEE Transactions on VLSI Systems, and an Associate Editor for the IEEE Transactions on Circuits and Systems II. From 2007-12 he was an Associate Editor for the IEEE Journal of Solid-State Circuits.