Shape-Shifting Accelerator Chips Offer Greener Way to Meet AI Demand

Hyper-architectures offer a more energy-efficient way to accelerate AI data centers and reduce costs

By Josh Baxt


Thanks to a new grant from the National Science Foundation (NSF), computer scientists at the University of California San Diego and MIT will develop a new generation of multi-tenant, fast, efficient and environmentally friendly accelerator chips to help meet exploding demand for artificial intelligence.

“We have been in an arms race to make these chips as fast as possible,” said UC San Diego Department of Computer Science and Engineering (CSE) professor Hadi Esmaeilzadeh, who holds the Halicioğlu Chair in Computer Architecture. “And while we have figured out how to run them fast, the problem we now face is to utilize them cost-effectively and reduce data center energy dissipation.”

Demand for neural networks and other AI services is increasing exponentially, and data centers will need to keep up. The key is developing accelerator chips, such as Google’s TPU and Microsoft’s Brainwave, to handle these demands. These chips are fast, but they may not be utilized to their fullest potential.

“Data centers will account for around 14 percent of total worldwide carbon emissions,” said Esmaeilzadeh. “These centers must scale their infrastructures, but they’re already consuming too much power. The solution is to consolidate requests on the same infrastructure. Don’t just add more accelerators—utilize them more effectively.”

Shape-Shifting Chips and the People Who Make Them

The problem is that existing accelerator hardware is quite rigid and designed to run only a single tenant. In other words, each chip can accept only one job at a time, which means it’s using only a fraction of its potential.

Last year, Esmaeilzadeh, along with CSE Ph.D. student Soroush Ghodrati and other collaborators, published a pioneering paper on spatial multi-tenancy, which outlined shape-shifting chips that adapt to changing workloads, making the chips more flexible and efficient. As demands on the hardware shift, the chips can be easily reorganized to virtually imitate multiple accelerators on a single chip.

These chips can reorganize themselves – in real time – to harness unused resources. At present, the team can create 65 different scenarios on one chip.

“If my chip has 2,000 ALUs (Arithmetic Logic Units), I'm not using all 2,000 at the same time,” said Esmaeilzadeh. “Some are sitting idle. If my architecture is single tenant, I can't use these resources to co-locate multiple workloads. Spatial multi-tenancy means we can run multiple workloads simultaneously in the same physical accelerator.”
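To make that intuition concrete, here is a minimal, hypothetical Python sketch, not the team’s actual design, that contrasts single-tenant scheduling with spatial multi-tenancy on a pool of 2,000 ALUs. The workload names and sizes are illustrative assumptions.

```python
from dataclasses import dataclass

TOTAL_ALUS = 2000  # total ALUs on the hypothetical accelerator


@dataclass
class Workload:
    name: str
    alus_needed: int  # ALUs this job can actually keep busy


def single_tenant_utilization(jobs):
    """One job owns the whole chip at a time; idle ALUs are wasted."""
    # Average utilization across the jobs run back to back.
    return sum(min(j.alus_needed, TOTAL_ALUS) for j in jobs) / (TOTAL_ALUS * len(jobs))


def spatial_multi_tenant_utilization(jobs):
    """Co-locate jobs on the same chip until the ALU pool is exhausted."""
    used = 0
    for job in jobs:
        grant = min(job.alus_needed, TOTAL_ALUS - used)  # give each job what fits
        used += grant
    return used / TOTAL_ALUS


# Hypothetical workloads, each too small to fill the chip on its own.
jobs = [
    Workload("vision-model", 600),
    Workload("speech-model", 500),
    Workload("recommender", 700),
]

print(f"Single tenant:        {single_tenant_utilization(jobs):.0%} of ALUs busy")
print(f"Spatial multi-tenant: {spatial_multi_tenant_utilization(jobs):.0%} of ALUs busy")
```

Under these assumed workload sizes, running the jobs one at a time keeps only about 30 percent of the ALUs busy, while co-locating them on the same chip pushes utilization to 90 percent, which is the payoff Esmaeilzadeh describes.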

The NSF grant is helping the UC San Diego/MIT team develop a complete system for next-generation data centers. To get there, Esmaeilzadeh will work closely with MIT professors Manya Ghobadi, who is a prominent researcher in high-performance networking, and Saman Amarasinghe, who is a legend in programming languages and compilers. Ultimately, the team wants to create data centers that are both faster and more efficient.

“By improving resource utilization, we are creating better scalability in response to market demand,” said Esmaeilzadeh. “Even more importantly, we are doing it while being environmentally conscious.”