Sustainable Computing in the New World of AI

The computing industry’s energy appetite is notoriously ravenous. Currently, this results in the industry producing between 2.1 and 3.9 percent of all global carbon emissions. And that is before the true impacts of new energy-consuming applications such as generative AI (artificial intelligence) and upgraded virtual communal spaces have kicked in.

The National Science Foundation (NSF) recently announced a new $12 million multi-institutional research initiative over the next five years to delve deeply into this topic, calling for a rethinking of the infrastructure for information and communication technology (ICT), and paving the way for sustainable computing. Co-led by David Brooks of Harvard and Benjamin Lee of the University of Pennsylvania, the initiative is called NSF Expeditions in Computing. Its overarching goal is admirably high: to reduce computing’s carbon footprint by 45 percent by 2030.

Caltech’s Adam Wierman, the Carl F Braun Professor of Computing and Mathematical Sciences and director of Information Science and Technology, is part of the new NSF Expeditions team. For the past decade he has spearheaded the design of algorithms to make data centers more sustainable, and his group has worked with industry to improve standards of measurement and reporting of the carbon costs of computation.

We recently talked with him about the AI boom, how data centers could actually help the electrical grid, the new NSF Expeditions initiative, and his role in a recent startup company called Verrus that aims to build sustainable data centers.

What is the challenge that this new NSF Expeditions initiative will be taking on?

The challenge is that AI is taking off and there is this massive excitement around it. And the things underlying that are data centers, GPUs (graphics processing units), and computing. These are using enormous amounts of energy already, and all indications are that this is going to be growing at absurd rates in the coming years. A company that adopts generative AI for something where they used to use more traditional optimization methods is probably using 10 to 20 times more carbon to do that, so if improvements aren’t made, you’re talking doubling or more in a five-year period in terms of the carbon footprint of the information technology that we do.

Already in many areas where data centers are located, like Northern Virginia or Ireland or Amsterdam, you’re talking about 20 or 30 percent of the electricity usage in those areas being devoted to data centers. And that’s before the boom of AI training is really hitting.

Utilities are reaching the point where they can no longer handle the demand. So, for example, you can’t build a data center in some regions right now because they just can’t handle the energy demand. And the big data centers that companies like OpenAI are planning to build are gigawatt-scale data centers. That is as much energy as medium-sized cities use going into a single campus for training.

Can you talk about the impact of these data centers?

For a data center built in the traditional way, if you think about carbon usage, you have the carbon that goes into the manufacturing of the hardware and the building. This is called embodied carbon. And then you’ve got the operational carbon from running and doing the training. And both of those are enormous. Then additionally, beyond that, you’ve got the water usage that is running into these data centers to aid with cooling. And the water demands are also enormous.

But another aspect of all of this is the local impact. When you look especially at new campuses that are being built in areas where there haven’t been many data centers before, you’re plopping down major energy- and water-using resources in moderate-size cities. This has an impact on energy and water rates for people who live there. There’s the pollution associated with the backup generators at the data centers, which have to be run in order to do regular maintenance. And so there are major local environmental and economic impacts from them in addition to just the global usage of carbon. These centers are also major construction projects. But beyond construction, they don’t create many jobs for the neighborhoods where they go because there’s not a lot of human labor needed to run them once they’re built. So, there are huge challenges around their construction.

And yet they’re essential for advancing AI and all of the improvements that come with that.

That relates to my next question about the Sidewalk Infrastructure Partners (SIP) startup company Verrus. How are you involved with that and what is the goal there?

I’m involved on the advisory board, and I was consulting with SIP as they thought about launching Verrus. Our goal is to build the most sustainable data centers operating at these huge scales.

Data centers today often are powered by large-scale backup generators and are just users of the grid and utilities, but they don’t need to be. In fact, a lot of technology already exists from our research and from many other groups like ours that have been working on this for the last decade. But it has been very hard for this technology to make its way into the data-center world until now.

All the challenges that we just talked about actually provide a huge market opportunity in the sense that utilities in many places where companies want to build data centers just won’t accept them, or at least there will be a very large interconnect queue unless a data center can help the utility meet its needs in some way. But if you can be a partner with the utility instead of a drain, then you can get access to capacity. You might be able to build in places where it would otherwise be hard.

To do that, there’s a lot of optimization work to be done: optimization of the way power works in the data center, of the way scheduling works, of the type of storage and backup power systems that you build in the center. And Verrus is going to be optimizing all of that really at the cutting edge so that it can be a partner for utilities in a way that data centers typically are not.

Going back to the NSF Expeditions, how does the initiative plan to make ICT greener?

I’d say that a decade ago it was a hard sell to have companies pay attention to the impact of the operational carbon in what they did. Gradually, the environmental impact of data centers started to become clearer, and operationally they started to measure and understand their carbon usage. A bunch of our students were some of the first ones to build the measurement tools at a number of these companies.

Then some companies started making purchasing agreements for solar and wind power whenever they built data centers. Some had net-zero goals on a yearly basis, trying to counteract the impact of their energy in the operational sense. But that doesn’t really solve the problem because you’re not feeding green energy in at the times when you’re using it, so you’re still creating massive strain on the grid.
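The gap between a yearly net-zero goal and green energy that is actually available when it is used can be made concrete with a small sketch. The function names and the four-hour profile below are hypothetical illustrations, not figures or tools from the interview.

```python
def annual_match(load, green):
    """Share of total consumption offset by total green purchases (capped at 1)."""
    return min(1.0, sum(green) / sum(load))


def hourly_match(load, green):
    """Share of consumption covered by green energy delivered in the same hour."""
    covered = sum(min(l, g) for l, g in zip(load, green))
    return covered / sum(load)


# Hypothetical profile in MWh: a flat load, with solar arriving only at midday.
load = [10, 10, 10, 10]
green = [0, 20, 20, 0]

print(annual_match(load, green))  # 1.0: looks net-zero over the whole period
print(hourly_match(load, green))  # 0.5: only half the load is matched when used
```

Under these made-up numbers the totals balance, yet half of the consumption still falls in hours with no green supply, which is the strain on the grid described above.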

It’s only recently that companies have been moving toward actual 24/7 net-zero promises in terms of their operating impact. But achieving that is a huge challenge. Part of the Expedition is trying to move toward a data-center design, from the operational perspective, that can achieve that goal. That means that you need to be very responsive to the grid, to energy availability, and to moving workloads. You need both to shift them in time, so that you do the work when you have more green energy available and do less when there’s less available, and to move them across locations so that you can do work where there’s green energy available. That’s a big algorithmic problem.
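As a rough illustration of shifting work in time and across locations, here is a minimal greedy sketch. It assumes an hourly carbon-intensity forecast per region (the region names and gCO2/kWh values are invented), and it ignores deadlines, capacity limits, and the cost of moving data, all of which a real scheduler would have to handle. It is not the Expeditions team's algorithm.

```python
def schedule_deferrable(intensity, hours_needed):
    """Time shifting: pick the lowest-carbon hours for a job that can wait."""
    ranked = sorted(range(len(intensity)), key=lambda h: intensity[h])
    return sorted(ranked[:hours_needed])


def pick_region(intensity_by_region, hour):
    """Location shifting: pick the region with the cleanest grid in a given hour."""
    return min(intensity_by_region, key=lambda r: intensity_by_region[r][hour])


# Hypothetical forecast, in gCO2 per kWh, for four consecutive hours.
forecast = {
    "region_a": [500, 300, 120, 450],
    "region_b": [200, 350, 400, 100],
}

print(schedule_deferrable(forecast["region_a"], 2))  # [1, 2]
print(pick_region(forecast, 0))                      # region_b
print(pick_region(forecast, 2))                      # region_a
```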

And then, as we discussed with relation to Verrus, data centers can actually help the grid. Many of the operating constraints of an electrical grid come in just a few hours a year where there’s really the peak need, like during the hottest time of the hottest summer day. That means that a data center being flexible and doing what we call "peak shaving" and reducing its loads in those moments of peak demand can help the grid itself operate much more efficiently and reduce the need to install new conventional generators. The transition from a data center as a huge load for the grid to one where it’s a resource that helps the grid operate more efficiently is a crucial one for the coming years.
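Peak shaving as described here can be sketched in a few lines. The threshold, the flexible fraction, and the demand figures below are hypothetical, chosen only to show the mechanism; real programs depend on utility signals and contracts that the interview does not detail.

```python
def peak_shave(load, grid_demand, peak_threshold, flexible_fraction):
    """Reduce data-center load by flexible_fraction in hours when grid demand is above the threshold."""
    return [
        l * (1 - flexible_fraction) if d > peak_threshold else l
        for l, d in zip(load, grid_demand)
    ]


# Hypothetical values in MW: a flat 100 MW campus on a grid that peaks mid-afternoon.
load = [100, 100, 100, 100]
grid_demand = [700, 950, 980, 800]

shaved = peak_shave(load, grid_demand, peak_threshold=900, flexible_fraction=0.25)
print(shaved)  # [100, 75.0, 75.0, 100]
```

In this toy case the campus gives back 25 MW in exactly the two hours when the grid is most constrained, and runs normally otherwise.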

On the embodied carbon side, there is a need to minimize the carbon that’s used in manufacturing the computing resources themselves. Here we are talking about your typical three R’s: reduce, reuse, recycle. In this area, there is not even an agreed-upon measure for the impact: the amount of embodied carbon that’s going into data centers, how you measure it, how you pay attention to where and when things were manufactured. Once you can measure and understand that effectively, you can start to evaluate how companies are doing and give them tools to reduce their embodied carbon for these data centers.

The ambitious goal is to actually reduce the carbon impact of ICT by 45 percent despite a massive growth in the use of computing infrastructure. And honestly, I think that’s doable if you can have a broad industry shift in reuse and recycling of hardware and operational integration with the grid.

I know that when you first came to Caltech as a computer scientist this work on sustainable computing wasn’t really on your radar. How do you feel about the work that you are doing today?

I think for me, there is a societal moment here that is just crucial. This is maybe one of the biggest challenges around energy at the moment. Because of the power of AI to help sustainability problems broadly (to design better materials, to design better storage, all of that), we need to enable it. So we need these data centers, and we need this computing infrastructure in order to help make all industries more sustainable. And, at the same time, we’re going to be constrained in the way we can use AI unless we can solve AI’s sustainability problem directly. So, this is a hugely important moment.

There’s a sense that what we’ve been talking about in my group and in this research area for 10 years is finally here and visible to the rest of the industry and the research community. We’ve been talking about sustainability, and we’ve been making progress along the way, but really the moment has arrived where this stuff is front and center. And it’s exciting to be able to make this impact on what is a really important problem at a really important time.